Abstract


This report describes the current status of the design
and implementation of ZETA --a prototype relational data base
management system. ZETA is composed of three principal levels.
The lowest level, MINIZ, provides primitive facilities for
representing and manipulating single relations on a tuple
oriented basis. The intermediate level, the EXECUTOR, uses the
facilities of MINIZ to provide a higher, relation oriented view
of the data base. The language facilities are provided at the
highest level in order to permit access to the data base through
a query language facility and a programmable interface.
1.1 The ZETA Project

In early 1973 a group was formed in the Department of
Computer Science to study data base management systems. Early
research focused on the relational model of data proposed by E.
F. Codd [Codd 1970-197?, 10-15] because it provided attractive
user-oriented properties while leaving all the efficiency details
to be decided by the support system. In order to understand the
problem of relational implementation, we decided to focus our
attention on building a system.

A relation can be viewed conceptually as a table of
data. The table's heading defines the relation name, the column
headings are the attribute names and each row corresponds to an
n-tuple of data values describing a single value. The set of
values which can be used in a column is called a domain. A
relational data base is composed of a set of time varying
relations inter-related through common domains.

First, a preliminary design [Brodie 1973, 3] was
explored. This led to a larger design and implementation effort
called the ZETA project. More details can be found in [Chan
1974, 6], [Czarnik 1974, 16] and [Leong 1974, 24]. This report
is a summary of that research.

The primary goals of the ZETA project are the design and
implementation of a relational DBMS. Much of the past and
current research into the relational model deals with its high
level logical capabilities, which are considerable [Codd and
Date]. However, the success of the relational approach lies
with the existence of an efficient implementation. ZETA is to
provide a framework for implementation research and for
applications development. For this last reason ZETA also
concerns itself with user interfaces.

ZETA has two major goals:

1. To investigate practical aspects of a relational
implementation, specifically: relational representations,
relational operators, system configurations and search
mechanisms.

2. To investigate user interfaces, specifically:

a. to design and implement a query language generating
system, and

b. to design and implement a host programming language
interface.

1.2 The ZETA System

ZETA is composed of three principal levels: MINIZ, the
EXECUTOR and the language facilities. A brief description of the
levels and their functions follows.

MINIZ is the lowest level of the ZETA system. It
provides the building blocks for the set oriented views of the
higher levels. Relations at this level are implemented directly
as files of data together with operations on them. There are
data definition and data manipulation primitives. The
manipulation facility enables the "marking" of a single relation
so that retrievals and modifications can be accomplished on
random tuples. MINIZ also returns information from various
schemas. The intermediate level of ZETA uses these facilities as
primitives for its operation.

The EXECUTOR is the intermediate level of ZETA. It
provides a higher relational view of the data base. It offers as
primitives to the higher language level multiple relation,
multiple pass operations. These operations are analyzed and
translated into the lower level primitives of MINIZ. The
EXECUTOR directs the execution of the operations through MINIZ
primitives. The two main features of the EXECUTOR are the
support of multiple relation queries and the ability to derive
new relations using "snapshots" --a form of user workspace.

The language level uses the intermediate level
primitives to provide the ZETA user interfaces. It provides two
modes of access: a programmable interface through a host
programming language and a self-contained problem oriented
interface through a query language generating system.

The three levels of ZETA are described separately: MINIZ
in chapter 2, the EXECUTOR in chapter 3 and the language
facilities in chapter 4. Chapter 5 describes the current state
and future plans of ZETA.
CHAPTER 2

MINIZ: Basic Relational System

2.1 Overview of MINIZ

MINIZ is the lowest level of the ZETA system. It
provides the building blocks for the set oriented views of the
higher levels. Relations at this level are implemented directly
as files of data together with operations on them. There are
data definition and data manipulation primitives. The
manipulation facility enables single pass, single relation
retrievals and modifications on one tuple at a time. MINIZ also
returns information from various schemas. The intermediate level
of ZETA uses these facilities as primitives for its operation.

MINIZ executes as a batch job on IBM 360 or 370
machines with OS/MVT and the PL/I Optimizer Compiler. Currently,
only a single user program may execute at a time; however,
primitive locking mechanisms are provided for future extensions
to multiple users.

Interaction with this low-level system may be in one of
two modes: data base administrator (DBA) or user mode. The DBA
deals with the technical aspects of the system. The DBA has
complete access to the system whereas the user is restricted in
the commands which may be executed and in the relations which
may be accessed.

This chapter describes MINIZ, the low level system. The
concept of a MARK is introduced. The environment and facilities
of MINIZ are outlined. The commands for the users of this low
level are described and, finally, implementation features
are presented. A knowledge of relations is assumed.

2.1.1 Markings

Only primary relations are representable at this level
of the system. Primary relations are implemented as ordinary
files where each record contains the data for one tuple.
Access to tuples is accomplished by first "marking" a primitive
relation.

A MARK is the basic mechanism to provide access to
subsets of data. A MARK corresponds to a unary relation which
stores indices of tuples in a primary relation satisfying a
certain Boolean condition on its domains. A primary relation
with such a derived relation on it is termed marked, and the
result a mark or marking. MARKs may be constructed either on a
primary relation or on another mark.

The unary relation is simply an array of indices which
are related in two ways. First, the indices point to tuples in
the same primary relation. Second, the tuples referenced satisfy
the same condition upon domain values at the instant the mark was
created.
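As an illustration only, the marking idea can be modelled in a few
lines of Python. This is not MINIZ code (MINIZ is written in PL/I);
the function name mark, the sample relation and its domain names are
assumptions made for the sketch.

```python
# Hypothetical model of a MARK: a list of 0-origin tuple indices.
def mark(relation, condition, base=None):
    """Return indices of tuples in `relation` that satisfy `condition`.

    `base` is an optional existing mark. When it is given, only the
    tuples it references are examined (a mark built on another mark).
    """
    candidates = range(len(relation)) if base is None else base
    return [i for i in candidates if condition(relation[i])]


# Sample primary relation (invented data).
employees = [
    {"NAME": "ADAMS", "DEPT": "SALES", "SALARY": 9000},
    {"NAME": "BAKER", "DEPT": "SALES", "SALARY": 12000},
    {"NAME": "CLARK", "DEPT": "TOYS", "SALARY": 15000},
]

# A mark on the primary relation, then a mark on that mark.
sales = mark(employees, lambda t: t["DEPT"] == "SALES")
well_paid_sales = mark(employees, lambda t: t["SALARY"] > 10000, base=sales)
```

Both results hold indices into the same primary relation, which is
the first of the two properties described above.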

At the applications level, the user may create or
destroy marks. In addition, the facility to retrieve the indices
stored in the mark is allowed. Although modification of indices
is allowed below the applications interface, no such operations
are currently permitted at that level or above it.


These unary relations are known as basic elements. A
basic element consists of a header and a body. The header
contains control information, while the body contains the tuples
of the unary relation. Hereafter, tuples in the basic elements
will be referred to as slots in the body.

Several standard operations are permitted upon basic
elements. They may be created or destroyed; bodies may be
increased or decreased in size. Values in slots may be read,
inserted or changed. Facilities exist for finding which slots
contain a certain value (i.e., like marking on the unary
relation). Basic elements may also be linked together.
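The operations listed above can be sketched as a small Python class.
This is a hedged model, not the MINIZ implementation: the class name
BasicElement, the header fields "size" and "next", and the method
names are all assumptions chosen for illustration.

```python
class BasicElement:
    """Hypothetical basic element: a header plus a body of slots."""

    def __init__(self, capacity):
        # Header holds control information (field names are assumed).
        self.header = {"size": capacity, "next": None}
        self.body = [None] * capacity

    def resize(self, capacity):
        """Increase or decrease the number of slots in the body."""
        grow = max(0, capacity - len(self.body))
        self.body = self.body[:capacity] + [None] * grow
        self.header["size"] = capacity

    def read(self, slot):
        return self.body[slot]

    def write(self, slot, value):
        """Insert or change the value held in a slot."""
        self.body[slot] = value

    def find(self, value):
        """Return the slots that contain `value`."""
        return [s for s, v in enumerate(self.body) if v == value]

    def link(self, other):
        """Chain another basic element after this one."""
        self.header["next"] = other


element = BasicElement(3)
element.write(0, 7)
element.write(2, 7)
```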

The full power of the basic element concept is not
utilized in this version of the system. However, the primitives
for experimenting with new ways of implementing joins,
projections and restrictions exist [Leong 1974, 24].

A relation may be viewed as a time-varying data entity
in the data base. Hence, a mark can be considered to be an
indirect referencing mechanism for a relation at some point in
time. However, when modifications on the data base occur the mark
may not reflect the changes. Therefore, users or the higher
level system must re-execute the mark if it is to be a current
view of the data base.
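A short Python sketch of this staleness, under the assumption that a
mark is simply a stored list of tuple indices (the names mark, stock
and low are invented for the example):

```python
def mark(relation, condition):
    """Hypothetical marking: indices of tuples satisfying `condition`."""
    return [i for i, t in enumerate(relation) if condition(t)]


stock = [{"PART": "BOLT", "QTY": 5}, {"PART": "NUT", "QTY": 50}]


def low(t):
    return t["QTY"] < 10


stale = mark(stock, low)   # mark taken before the modification
stock[1]["QTY"] = 3        # the data base changes afterwards
fresh = mark(stock, low)   # re-executing the mark gives a current view
```

The stored mark still refers only to the first tuple; only the
re-executed mark sees the second one.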

Marks provide fast access to subsets of data and reduce
the storage requirement by eliminating data duplication. On the
other hand, primary relations are perfectly current. These
properties must be weighed in choosing the best combination of
primary relations and marks for each data base application.
2.1.2 The Schema

The schema describes the translation of the user's
logical view of data in the system into the physical
representation. This function is fulfilled by the relation table
and domain table together with the user table. A master relation
relates primary relations to their physical files. Each of these
tables may itself be considered a relation and is accessed as
such from the higher levels.

RELATION Table

The relation table contains an entry for each primary or
derived relation in the system. Each tuple has the following
domains:

RELATION_NAME: The relation name.

DOMAIN: Domain names of the relation. Indices to the domain
table are given.

TYPE: The type of relation: primary or MARK.

RELATION_INDEX: The index of the relation in the master
relation or, for MARKs, the basic element index.

RELATION_WIDTH: The width of the relation as stored in the
data base.

SIZE: The number of tuples in the relation.

DOMAIN Table

The domain table contains one entry for each primary
relation domain in the system. A domain in a marking has the
same schema entry as the relation upon which it is based. Each
tuple has the following domains:

The  domain  name. 

INVERT: The inverted list index for this domain, if any.

RELATION: The indices of primary and derived relations using
this domain.

DISPLACEMENT: The byte displacement of the domain from the
start of a tuple.

DOMAIN_WIDTH: The byte width of the domain as stored in the
relation.

DOMAIN_INDEX: The index into the master relation of the file
containing this domain.

TYPE: The type of the domain, namely: bit, character, fixed
decimal, fixed binary, float decimal, float binary, or date.

LOCK: A field used for locking relation domains to prohibit
other users from accessing the relation.
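To illustrate how a byte displacement and a byte width locate a
domain inside a stored tuple, here is a minimal Python sketch. The
table contents, the sample record and the function name extract are
assumptions; ASCII is used for convenience, whereas the original
system ran on IBM 360/370 machines.

```python
# Hypothetical domain table entries: displacement and width in bytes.
domain_table = {
    "NAME": {"displacement": 0, "width": 8},
    "DEPT": {"displacement": 8, "width": 5},
}


def extract(record: bytes, domain: str) -> str:
    """Slice one domain value out of a stored tuple."""
    entry = domain_table[domain]
    start = entry["displacement"]
    end = start + entry["width"]
    # Values are held as character strings; trailing blanks are padding.
    return record[start:end].decode("ascii").rstrip()


record = b"ADAMS   SALES"   # 8-byte NAME followed by 5-byte DEPT
```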

USER Table

The user table contains one or more entries, linked
together, for each user in the system. Both the PASSWORD and LOCK
code of the user are present in each tuple entry. The indices of
relations, RELATION(I).INDEX, which the user may access are
listed along with a bit, RELATION(I).CAPABILITY, giving the user
either read-write or read-only access to the relation.

Each of the above relations also has a field, ACCESSES, to
store the number of tuple accesses to each schema relation.
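The capability bit can be pictured with the following Python sketch.
It is an assumption-laden model: which bit value means read-write is
not stated in the report, and the dictionary layout and the function
may_access are invented for illustration.

```python
# Assumed encoding of the capability bit (not given in the report).
READ_ONLY, READ_WRITE = 0, 1

# Hypothetical user table entry with per-relation capabilities.
user_entry = {
    "password": "SECRET",
    "relations": [
        {"index": 4, "capability": READ_WRITE},
        {"index": 9, "capability": READ_ONLY},
    ],
}


def may_access(entry, relation_index, writing):
    """Return True if the user may read (or write) the relation."""
    for rel in entry["relations"]:
        if rel["index"] == relation_index:
            return rel["capability"] == READ_WRITE or not writing
    return False   # relation not listed for this user
```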

One monitor (SYSTEM-TABLES) controls all access to these
schema tables. Requests to insert, delete, read or update any
relation go through this schema monitor. All schema changes are
performed by the data base system. At higher levels, the user may
only read the schema.

The master relation contains an entry for each primary
relation file. The user has no access to this system relation;
only the file system may access it. Each tuple has the following
domains:

DDNAME: The name of the relation. Also the DDNAME on the
user-supplied DD card defining the relation data set.

LOG_RECL: The byte length of a relation tuple.

MAX_#_LOG_RECS: The maximum number of tuples allowed in the
relation.

READ_INDEX: Index of last record read from the file.

WRITE_INDEX: Index of last record written to the file.

2.1.3  Privacy,  Security  and  Integrity 

This system does not attempt to solve these problems.
The current version provides privacy and security in the sense
that no user other than the DBA and the creator of a primary or
derived relation may access it. Commands could be included in
MINIZ to provide manipulation of access capabilities.

The current locking mechanism implemented is at the
relation level (i.e., each domain of the relation is locked). A
lock on a relation essentially opens the primary file for the
various operations which follow and prevents others from
accessing the relation or the markings based upon it. Locking is
currently a feature used to eliminate the overhead of implicit
locks for each command processing the locked relation(s).

A level of data integrity is guaranteed by the current
implementation. Only one user executes at a time; only
user-directed commands affect data on disc, and only relations
created by the user are accessible to the user. The user can
protect himself from himself by creating backup relations or by
saving the primary relation files on disc or tape files.

2.2 Operations

This section describes all commands accepted by the
basic relational data base system. Application programs may use
either the PL/I Optimizer or Checkout compilers. Commands in the
application program are represented as procedure calls to the
driving procedure, MINIZ.

Parameter passing is by reference and no values passed
are changed. Command syntax is rigid at this primitive level;
failure to adhere to conventions may lead to obscure errors or to
invalid results. Each command call has the following format:


CALL MINIZ (ACTION, RELATION, DOMAIN, TID,
            QUALIFICATION, MESSAGE_INFO);

Parameters

ACTION is the command name.

RELATION is an array of names denoting relations or marks used
in the command. Relation and mark names must be unique, nonblank
and contain only alphanumeric or National characters (i.e. @, $,
or #). The first character must be an alphabetic.

DOMAIN is a structure array of nonblank 8 character domain names
and of their associated values or descriptions. Domain names
within a relation must be unique. Relations are restricted to no
more than 12 domains.

TID is an array of two integers for tuple identifiers (i.e.,
0-origin indices to tuples in a relation or a mark). The CREATE
and CREATE-USER commands have special uses for this parameter.

QUALIFICATION is a pointer to a Boolean condition which is
represented as a binary tree of conditions on domains or on names
of marks.

MESSAGE_INFO passes user and costing information to the system
and returns the costs of operations and the number of tuples
in a relation. For each command, MESSAGE_INFO must give: a
DBA-assigned user code; a user-chosen password; and a value, Y
or N, for the cost FUNCTION switch. Y for FUNCTION initiates the
costing scheme and the system returns a value indicating the
expense of a command without executing it. Cost values start at
0; low values indicate low costs. The cost is in terms of
execution time. N inhibits the costing scheme; hence, the
command is executed.

For each command, MESSAGE_INFO returns a message and its
system code. Also, when marking a relation or mark, the system
provides the number of tuples marked.
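The six-parameter call and the cost FUNCTION switch can be mimicked
in Python as follows. This is only a sketch of the calling
convention: the command table, the cost figures and the shape of the
returned dictionary are assumptions, and only two commands are
modelled.

```python
def miniz(action, relation, domain, tid, qualification, message_info):
    """Hypothetical stand-in for the MINIZ driving procedure."""
    # Assumed command table: name -> (cost estimate, handler).
    commands = {
        "GET": (1, lambda: ("GET", relation[0], tid[0])),
        "MARK": (5, lambda: ("MARK", relation[0], qualification)),
    }
    cost, handler = commands[action]
    if message_info["function"] == "Y":
        # Costing scheme: report the expense, do not execute.
        return {"cost": cost, "executed": False, "result": None}
    # FUNCTION = N: costing is inhibited and the command is executed.
    return {"cost": None, "executed": True, "result": handler()}


info = {"user_code": 1, "password": "PW", "function": "Y"}
estimate = miniz("GET", ["EMP"], [], [3, 0], None, info)
info["function"] = "N"
outcome = miniz("GET", ["EMP"], [], [3, 0], None, info)
```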

2.2.1 Schema Definition

Commands to define the schema, structure of the data
base and user access capabilities, make up the Data Definition
Language (DDL). DDL facilities are limited in this prototype.
Data base creation or initialization is done by a utility. This
initializes all schema tables and files. Initialization also sets
up the schema entry for the data base administrator (DBA) who has
complete access to the entire data base and to all commands. No
facilities exist for supporting multiple data bases although the
system may be extended for this purpose.

All schema tables are kept in core while the user is
signed on to MINIZ. This enables commands to be executed with
greater efficiency. However, should the application program fail
to sign off due to omission of a SIGNOFF or due to a PL/I error,
then the copy of the schema on disc will not be updated to
reflect the actual schema. To remedy this situation, the user
must execute a recovery algorithm.

The following sections describe the DDL. This includes
commands to CREATE and to DESTROY relations. DESTROY may also be
used to delete a mark from the schema. The DBA also has available
a command to create a schema entry for a new user (CREATE-USER).

CREATE

This command declares a relation. The system enters
relation information into the RELATION_TABLE and domain
information into the DOMAIN_TABLE. The current system also adds
the capability of read and write access to the relation to the
USER_TABLE entry for the user creating the new relation.
Consequently, this version of the system only allows the user who
created the relation to access it. The entries are complete
except for index information on the domains. The user is
required to specify both the logical information such as the
domain names and physical information such as the types of
domains and the maximum number of tuples allowed in a relation.

The system contains three reserved relations whose names
must not be used in CREATE. They store schema information and
may be queried by all users. The relations are RELATION, DOMAIN,
and USER.

Once a relation is CREATEd, an application program may
only access it if the user includes a data set which is to
contain the relation.

For each domain both a domain name and a domain type
must be given. The types supported by MINIZ are: bit, character,
fixed binary, fixed decimal, floating binary, floating decimal
and a special type called date. As in PL/I, the precision may be
specified.


There is a special use for the TID parameter for the CREATE
command. It specifies the maximum number of tuples allowed
in the relation. Each logical relation tuple is composed of one
or more 128 byte physical tuples. The system does all blocking
and deblocking of data.
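The blocking arithmetic implied here can be written out in Python.
The sketch assumes that every logical tuple starts on a physical
record boundary, which is consistent with "one or more 128 byte
physical tuples" but is not spelled out in the report; the function
names are invented.

```python
PHYSICAL_RECORD_BYTES = 128


def physical_records_per_tuple(tuple_width):
    """Number of 128-byte physical records needed for one logical tuple."""
    # Ceiling division without floating point.
    return -(-tuple_width // PHYSICAL_RECORD_BYTES)


def physical_records_needed(tuple_width, max_tuples):
    """Physical records to reserve for a relation of `max_tuples` tuples."""
    return physical_records_per_tuple(tuple_width) * max_tuples
```

For example, a 300-byte tuple would occupy three physical records,
so a relation declared with a maximum of 1000 such tuples would need
3000 physical records under this assumption.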

DESTROY 

This command destroys a relation or a marking entry in
the schema. Only the logical definition of the relation is
destroyed; the physical data in the user's relation file remains
unchanged.

The system does not allow a relation to be destroyed
while it still has marks based upon some or all of its domains.
Therefore, the marks must first be destroyed. Furthermore, no
relation may be destroyed unless other users are 'locked out'.
Since marks are based on relations, destroying a mark also
requires the underlying relation to be locked.

CREATE-USER

This command can only be executed by the data base
administrator (DBA). The system creates an entry in the USER
schema table for the new user. The DBA specifies the password
chosen by the user and the system returns the new user code.

Currently this is the only command to manipulate user
schema information. Commands for adding, changing or deleting
user capabilities may be added later.

The TID parameter has a special use in the CREATE-USER
command. The system assigns TID(1) the code for the new user.
2.2.2 Marking Facility

At a very low level, this facility is useful for the
system to take some condition on the relation and go through the
entire relation selecting tuples which satisfy that condition.
This must include search optimization for efficient support of
large DBMS applications. The programmer is relieved of doing his
own scan at a higher level. Furthermore, it provides a useful
tool for implementing projections and joins, elementary
operations in Codd's relational model [Codd 1970-1971, 10-11].


MARK 

This command determines which tuples of the named user
relation, schema relation or mark satisfy the Boolean tree
expression supplied by the user. The resulting tuple indexes are
stored in chained array segments known as 'basic elements'. The
system also enters in the relation table the schema information
describing each mark.
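A Boolean tree expression of this kind can be evaluated recursively.
The Python sketch below is a hedged illustration: the tuple-based
node layout, the operator set and the names holds and mark_with_tree
are assumptions, and leaves that name other marks are not modelled.

```python
import operator

# Assumed comparison operators for leaf conditions.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def holds(node, tup):
    """Evaluate a binary condition tree against one tuple.

    Interior nodes: ("AND", left, right) or ("OR", left, right).
    Leaves:         ("COND", domain_name, op, value).
    """
    kind = node[0]
    if kind == "AND":
        return holds(node[1], tup) and holds(node[2], tup)
    if kind == "OR":
        return holds(node[1], tup) or holds(node[2], tup)
    _, domain, op, value = node
    return OPS[op](tup[domain], value)


def mark_with_tree(relation, tree):
    """Indices of the tuples that satisfy the tree."""
    return [i for i, t in enumerate(relation) if holds(tree, t)]


parts = [
    {"PART": "BOLT", "QTY": 5, "CITY": "ROME"},
    {"PART": "NUT", "QTY": 50, "CITY": "OSLO"},
    {"PART": "CAM", "QTY": 2, "CITY": "OSLO"},
]
tree = ("AND", ("COND", "QTY", "<", 10), ("COND", "CITY", "=", "OSLO"))
```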

A MARK may be performed upon the relation table. Hence,
the user can obtain information about marks and relations created
by users. Currently, the user may only mark upon the relation,
RELATION, with a qualification of the form
'RELATION_NAME = <name of relation or mark>'. Since all relation
and mark names are unique, only one tuple in RELATION will
be marked.

To execute a mark other than one upon the schema, the
user must lock the relation which is being marked. In the case of
a mark upon another mark, the relation upon which the mark is
based must be locked.

GET-MARK

This command allows the user to retrieve, one at a time,
the tuple ids stored by a MARK. Should the user attempt a
GET-MARK past the end of the marked tuples in a mark, then an
error code is returned with the accompanying message, 'ATTEMPTED
FETCH OUT OF BOUNDS OF ALLOWED RELATION EXTENT.'. The TID
parameter is used as follows. TID(1) specifies the index into
the MARK. TID(2) returns the index of the marked tuple in the
relation.
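The use of the two TID elements can be sketched in Python. Note the
shift in notation: the PL/I elements TID(1) and TID(2) become tid[0]
and tid[1] in a Python list. The exception class and the function
name get_mark are assumptions; the real system returns an error code
and message rather than raising an exception.

```python
class FetchOutOfBounds(Exception):
    """Stands in for the out-of-bounds error code and message."""


def get_mark(mark, tid):
    """Look up one stored tuple id.

    tid[0] (TID(1) in PL/I) is the 0-origin index into the mark.
    tid[1] (TID(2) in PL/I) receives the index of the marked tuple.
    """
    if not 0 <= tid[0] < len(mark):
        raise FetchOutOfBounds("fetch past the end of the marked tuples")
    tid[1] = mark[tid[0]]


marked = [4, 7, 9]   # a mark holding three tuple indices
tid = [1, 0]
get_mark(marked, tid)
```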

2.2.3 Retrieval

The system supports one command for retrieval of data
from relations, the GET command. All relation data is represented
by PL/I character strings. Retrieved data is in the form of
character strings. The user is responsible for conversions in
the current system.

GET

This  command  enables  a  user  to  retrieve  the  values  of 
any  or  all  domains  in  a  specific  relation  tuple.  Schema 
information  may  also  be  retrieved. 

In reading a relation the user must be aware of certain
conventions. This version of the system requires that the
relation be locked before GETs may be performed.
In addition, the user must realize that deleted tuples
are still present in the system. Hence, a special message code
indicates that a dummy tuple was read and that it should be
ignored (or at least, processed differently). The user may read
all tuples of a relation by indexing through actual (not dummy)
tuples using a query on the RELATION schema. He may also read
through an entire relation by checking for the physical or
logical end of file. Retrievals may be made upon three schema
relations: RELATION, DOMAIN, and USER.

The DOMAIN and TID parameters are used as follows. The
DOMAIN parameter contains the names of domains in the relation
for which the system returns the value in the tuple specified.
The system then returns values which correspond to the respective
domain names passed in by DOMAIN. Domains may be specified in any
order. The TID parameter contains the index of the tuple which
the user wants retrieved.

2.2.4  Modification  Capabilities 

In a general data base system, commands for modification
of the data base are necessary. These include insertion, deletion
and update facilities. In MINIZ, all commands are
tuple-oriented. The user specifies upon which tuple of a relation
to perform the operation. However, since these are commands which
modify data, the user must LOCK the relation or relations
concerned.


DELETE


This command deletes a tuple from a relation. Deleted
tuples are still present in the data base. When retrievals are
done on a relation, the system will return a special value should
a dummy (deleted) tuple be read. At this primitive level, the
onus is on the user of the system to process dummy tuples.
TID(1) is the index of the tuple to be deleted.

INSERT

This command inserts a tuple into a relation. No index
need be specified by the user since inserts are always done at
the end of a relation. By using the CREATE command in
combination with INSERT, the user can set up a data base. All
domains of the relation must be given in the DOMAIN.NAME array
parameter. The corresponding value of each domain must be given
in the respective DOMAIN.INFO field. No conversion or checking
of data is done by this system. The user must ensure that the
tuple inserted agrees with the schema definition.

UPDATE

This command updates one or more domains of a specified
tuple in a relation. The DOMAIN parameter is specified just as
for the INSERT command. However, only those domains to be
updated are specified.
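The three modification commands and the dummy-tuple convention can
be modelled together in Python. This is a sketch under assumptions:
a deleted tuple is represented by the placeholder DUMMY (the report
only says a special value is returned), and the function names are
invented.

```python
DUMMY = None   # assumed placeholder left behind by a deletion


def insert(relation, tup):
    """Inserts always go at the end; return the new tuple's index."""
    relation.append(dict(tup))
    return len(relation) - 1


def delete(relation, tid):
    """The slot stays in place and becomes a dummy tuple."""
    relation[tid] = DUMMY


def update(relation, tid, changes):
    """Only the domains named in `changes` are altered."""
    relation[tid].update(changes)


def live_tuples(relation):
    """Skip dummy tuples, as the user of GET is expected to do."""
    return [(i, t) for i, t in enumerate(relation) if t is not DUMMY]


emp = []
insert(emp, {"NAME": "ADAMS", "DEPT": "SALES"})
insert(emp, {"NAME": "BAKER", "DEPT": "TOYS"})
delete(emp, 0)
update(emp, 1, {"DEPT": "SHOES"})
```

Because deletion does not shift the remaining tuples, indices stored
in existing marks keep pointing at the same positions.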

2.2.5 Locking

When changes are made to relations in a multi-user
environment, data integrity is always a problem. In this system,
only a very primitive locking mechanism is implemented.

These commands lock or unlock one or more relations. A
warning is given by a special return value if an attempt was made
to LOCK (UNLOCK) one or more relations that were already LOCKed
(UNLOCKed). These are only warning messages and do not inhibit
locking or unlocking of relations.
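The warning behaviour can be sketched in Python as follows. The
class LockTable and the ("OK" / "WARNING", names) return shape are
assumptions standing in for the special return value mentioned
above.

```python
class LockTable:
    """Hypothetical record of which relations are currently locked."""

    def __init__(self):
        self.locked = set()

    def lock(self, names):
        repeated = [n for n in names if n in self.locked]
        self.locked.update(names)   # locking proceeds regardless
        return ("WARNING", repeated) if repeated else ("OK", [])

    def unlock(self, names):
        repeated = [n for n in names if n not in self.locked]
        self.locked.difference_update(names)   # unlocking proceeds too
        return ("WARNING", repeated) if repeated else ("OK", [])


table = LockTable()
first = table.lock(["EMP"])
second = table.lock(["EMP", "DEPT"])   # EMP was already locked
```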

2.2.6  Schema  Queries 

In order to find out about relations or marks, the user
may query any of three schema relations using only GETs and
MARKs. These relations, already described in section 2.1.2, are:

RELATION which has information on the relations and marks,

DOMAIN which has information on domains in relations,

USER which has user information such as user codes,
passwords, and access capabilities.
2.3 MINIZ Architecture

The system is structured as a hierarchy of PL/I
procedures with one monitor procedure, MINIZ, calling the various
utilities required to execute a command. Only one user
application, written in PL/I, may access the system at a time.
However, future applications may be able to use the locking
mechanism to support multiple users.

The system was written in a high level language for
purposes of portability, clarity, ease of programming and
debugging. PL/I also has disc I/O facilities. User application
programs must also be written in the host language, PL/I.

This section concentrates on the overall system structure;
each procedural component of the system is described. In
addition to these system components, several system disc files
are required. System tables and lists are read into core memory
when the system is started by a SIGNON command. Before
terminating execution, the user executes a SIGNOFF command to
write all system information back onto disc. Message and command
log tables are also stored on disc.

2.3.1  System  Components 
MINIZ

This is the driving algorithm for the system containing
all command procedures. Its parameters are the six mentioned in
Section 2.2, namely: ACTION, a list of RELATIONs and DOMAINs,
tuple identifiers (i.e., TID), a possible condition pointed to by
QUALIFICATION, and some MESSAGE_INFO. Basic utilities are
contained within it to process messages to the user (MESSAGE) and
to log all system commands (LOGSYST). In addition, MINIZ relies
upon other external procedures for manipulating low level data
and for providing schema information.

Before a command is executed, the SECURITY procedure
checks the following: command access capability for that user;
proper relation specification for that command and whether the
access required to the relation(s) is allowed for that user.
Then, depending upon the ACTION specified, the monitor invokes
the appropriate module within MINIZ.

SYSTEM-TABLES

This monitors all accesses to the schema tables:
RELATION, DOMAIN and USER. As at the user level, schema tables
are considered to be relations and operations upon them are
tuple-oriented. The tables are maintained by a standard set of
modification commands and commands for secondary storage.

The table sizes place certain restrictions upon the
system. These sizes are given to indicate the restrictions.

The size of RELATION_TABLE and DOMAIN_TABLE is 255; that of
USER_TABLE is 20. Hash tables for these tables are of size 211
with 89 slots for overflow. For example, no more than 255 primary
relation domains can exist.
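
The sizes quoted above (211 hash slots with 89 overflow slots) suggest a
hash table with a separate overflow area. The following is a minimal
sketch of that idea only. The hash function, the chaining scheme and the
name SchemaHashTable are assumptions made for illustration; the report
does not describe how MINIZ actually resolves collisions.

```python
PRIMARY_SLOTS = 211   # hash table size given in the text
OVERFLOW_SLOTS = 89   # overflow area size given in the text


class SchemaHashTable:
    """Hash table with a fixed overflow area (hypothetical layout)."""

    def __init__(self):
        # Each cell holds (name, entry_index, next_overflow_slot) or None.
        self.primary = [None] * PRIMARY_SLOTS
        self.overflow = [None] * OVERFLOW_SLOTS
        self.next_free = 0  # next unused overflow slot

    @staticmethod
    def _hash(name):
        # Assumed hash function: sum of character codes modulo table size.
        return sum(name.encode()) % PRIMARY_SLOTS

    def insert(self, name, entry_index):
        home = self._hash(name)
        if self.primary[home] is None:
            self.primary[home] = (name, entry_index, None)
            return
        if self.next_free >= OVERFLOW_SLOTS:
            raise OverflowError("overflow area is full")
        slot = self.next_free
        self.next_free += 1
        # Link the new overflow cell at the head of the home slot's chain.
        head_name, head_index, head_next = self.primary[home]
        self.overflow[slot] = (name, entry_index, head_next)
        self.primary[home] = (head_name, head_index, slot)

    def lookup(self, name):
        cell = self.primary[self._hash(name)]
        while cell is not None:
            cell_name, entry_index, nxt = cell
            if cell_name == name:
                return entry_index
            cell = self.overflow[nxt] if nxt is not None else None
        return None
```

With this layout the overflow area, not the table of 255 entries, is the
first limit reached when many names hash to the same slot.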

This procedure manipulates the unary relations used for
the system's derived relation type (i.e., a mark). Unary
relations are implemented by [ZETA 1974, 40;
Tsichritzis lo'^n, 38]. These are constructed from a header and a
body as described in section 2.1.1. A full set of capabilities
is provided with which basic elements are manipulated.

FILE-SYSTEM

The file system manipulates user relations on disc. It
is also responsible for backup and startup of the system. The
file handling facilities provided are similar to those provided
in the UNIX operating system.

Each user relation corresponds to a BDAM file with 128
byte physical records blocked to form logical records (i.e.,
relation tuples). FILE-SYSTEM handles blocking and deblocking of
records. The RELATION_INDEX or DOMAIN_INDEX fields in the
RELATION_TABLE or DOMAIN_TABLE entries, respectively, give
FILE-SYSTEM the required information.
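
Blocking and deblocking can be sketched as follows. This is an
illustration under stated assumptions, not the FILE-SYSTEM code: it
assumes fixed-length tuples, each starting on a physical record boundary
and padded with zero bytes, and all function names are hypothetical.
Only the 128 byte record size comes from the text.

```python
PHYSICAL_RECORD_SIZE = 128  # bytes per physical record, as stated in the text


def records_per_tuple(tuple_length):
    """Number of physical records needed to hold one logical record."""
    return -(-tuple_length // PHYSICAL_RECORD_SIZE)  # ceiling division


def block(tuple_bytes):
    """Split one tuple into zero-padded 128 byte physical records."""
    count = records_per_tuple(len(tuple_bytes))
    padded = tuple_bytes.ljust(count * PHYSICAL_RECORD_SIZE, b"\x00")
    return [
        padded[i * PHYSICAL_RECORD_SIZE:(i + 1) * PHYSICAL_RECORD_SIZE]
        for i in range(count)
    ]


def deblock(records, tuple_length):
    """Rebuild a tuple from its physical records, dropping the padding."""
    return b"".join(records)[:tuple_length]


def first_record_number(tuple_number, tuple_length):
    """Direct-access record number of a tuple, counting tuples from zero."""
    return tuple_number * records_per_tuple(tuple_length)
```

Under these assumptions a 300 byte tuple occupies three physical records,
so the third tuple of such a relation begins at record number 6.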

2.3.2 Recovery

All schema information is kept in core. Should the
user's application program or the system itself terminate
abnormally, the copy of the schema on disc will not correctly
reflect the current schema. A special utility, RECOVER, is
provided to re-execute those commands stored on the SYSLOG file.
Only commands affecting the schema which were executed
successfully are re-executed, namely: INSERT, DELETE, CREATE,
DESTROY, LOCK, UNLOCK, and CREATE-USER. Since Boolean conditions
are not saved on SYSLOG, a MARK is not re-executed. A DBA
SIGNOFF is then executed to write the updated tables onto disc.
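
The replay rule described above can be sketched as follows. The log
entry layout (a dictionary with "command", "args" and "succeeded" keys)
and the execute callback are assumptions made for illustration; the
report does not give the SYSLOG record format.

```python
# Commands that RECOVER re-executes, as listed in the text.
SCHEMA_COMMANDS = {
    "INSERT", "DELETE", "CREATE", "DESTROY",
    "LOCK", "UNLOCK", "CREATE-USER",
}


def recover(log_entries, execute):
    """Replay logged commands in order; return how many were re-executed."""
    replayed = 0
    for entry in log_entries:
        # Skip failed commands and commands such as MARK that are not
        # in the replay list.
        if entry["succeeded"] and entry["command"] in SCHEMA_COMMANDS:
            execute(entry["command"], entry["args"])
            replayed += 1
    return replayed
```

Replaying in log order matters: a DESTROY must follow the CREATE it
undoes, or the rebuilt schema would differ from the one that was lost.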
2.4 MINIZ Summary

This chapter has described MINIZ, the lowest level of
ZETA. It provides a single relation, single pass, tuple-at-a-time
view of the data base. The primary feature of this level is
the mark mechanism, a unary pointer relation, which is used to
provide access to subsets of data. Other features described in
this chapter were: the basic architecture, the DDL primitives,
the DML primitives for retrieval and modification and, finally,
the schema query operations.
Chapter 3

Executor: Intermediate Level

The executor forms the intermediate level of the ZETA
system. Its major function is to interpret the intermediate form
of the DML commands, which were formed by the high level system
compilers, into basic commands for the low level system, MINIZ.
The executor drives MINIZ, managing and coordinating the basic
level facilities. It also supports a second level of schema
tables. These tables are structured differently from the basic
system schema.

There are two main features at this level. The first
important feature provided by the executor is the ability to
derive new relations from existing primary or derived relations.
Primary relations are those relations created by the DBA using
the DDL. Derived relations are those created using the DML
facilities. The system currently supports derived relations
called "snapshots". A snapshot is a derived relation which is
independent of the relations from which it was created. When
created, it represents a "picture" of a portion of the data base.
It is used as a quick reference to a subset of the data base.
The second feature of the executor is its ability to support
multiple relation, multiple pass queries.

The Interpreter is the main module in the executor. It
receives the intermediate form of the DML commands which are
represented in a set of data structures. The interpreter uses
the facilities and information in the user schema to execute the
commands. Commands are executed by emitting commands, via a set
of utilities, to MINIZ. This chapter describes the data
structures which represent the commands, the user schema and its
procedures, the interpreter and, finally, the utilities which
communicate with MINIZ.

3.1 Internal Representation of DML Commands

A DML command is represented by a set of data structures
which form the interface between the high level compilers and the
executor. These structures are built by the compilers and passed
to the executor. Three structures, describing the user and the
command type, are passed as parameters in a procedure call to the
executor. One further structure, the execution stack, has a
pointer to it passed as a parameter in the call. The execution
stack indicates proper decomposition of the DML command.
This stack, in turn, is made up of two structures: the CONTROL
and the SUBCOMMAND. These last two structures represent portions
of the command by using a final structure called the Boolean
condition tree.

The  calling  sequence  for  the  executor  is: 

EXEC (USER_INFO, COMMAND, DOMAINS, EXECUTION_STACK)

USER_INFO contains the user identification code and password; a
return message code and text; the number of tuples associated
with a new relation in the case of a CREATE_SNAPSHOT command and
an estimated cost of the operation.

COMMAND contains an operation code which names the command and
the name and type of the relation to be operated on or created.

DOMAINS contains domain names given by the user, information
about those domains and mailboxes for retrieved results.

EXECUTION_STACK contains information about the sequence of
execution of a DML command. It is composed of linked CONTROL and
SUBCOMMAND structures.


CONTROL

This structure is required whenever the command has an
associated where clause --the search qualification. One CONTROL
structure is required by UPDATE and DELETE while as many as two
are required for a CREATE. The CONTROL structure gives detailed
information about the DML command and the various types of
options specified. CONTROL contains the names of the relation
and domains involved plus a pointer to the where clause.


SUBCOMMAND 

A SUBCOMMAND structure is required whenever an implicit
join operation (i.e., a multi-relation qualification) is required
by the where clause of the CONTROL structure or in the
computation of a value for a domain in the INSERT and UPDATE
commands. It gives information about the implicit join
operation. SUBCOMMAND contains the names of the relation and
domains involved, a location to store the result and any
arithmetic function needed.

This condition tree is made up of two types of
structures, the Boolean OPERATORs which form the nonleaf nodes of
a tree and the LEAFs of the tree which form the test on the
domains.


OPERATOR

An OPERATOR structure is allocated for each Boolean
connector found in the DML command. This structure has two links
representing the two sides of the binary operation. Each side
could be a subtree of OPERATORs and LEAFs or a single LEAF
structure. The operator node contains the Boolean operator, AND
or OR, plus left and right subtree pointers.

LEAF

A LEAF or terminal node of the tree is allocated for
each occurrence of a mark name or a domain name, comparison
operator and value triple which forms a condition in the DML
command.

The comparison operators allowed are:

=, ¬=, <, <=, >, >=, *=, ¬*=.

*= stands for the set comparison operator IS ONE OF.

¬*= stands for IS NOT ONE OF.
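
The condition tree described above can be sketched and evaluated against
a single tuple as follows. This is an illustration only: the class and
field names are assumptions, a tuple is modelled as a dictionary from
domain name to value, and LEAF nodes that name a mark rather than a
domain are left out.

```python
import operator
from dataclasses import dataclass
from typing import Any, Union

# One function per comparison operator listed in the text.
COMPARATORS = {
    "=": operator.eq,
    "¬=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "*=": lambda value, values: value in values,       # IS ONE OF
    "¬*=": lambda value, values: value not in values,  # IS NOT ONE OF
}


@dataclass
class Leaf:
    domain: str        # domain name being tested
    comparison: str    # key into COMPARATORS
    value: Any         # single value, or a set for *= and ¬*=


@dataclass
class Operator:
    connector: str                     # "AND" or "OR"
    left: Union["Operator", Leaf]
    right: Union["Operator", Leaf]


def evaluate(node, tuple_values):
    """Return True when the tuple satisfies the condition tree."""
    if isinstance(node, Leaf):
        test = COMPARATORS[node.comparison]
        return test(tuple_values[node.domain], node.value)
    left = evaluate(node.left, tuple_values)
    if node.connector == "AND":
        return left and evaluate(node.right, tuple_values)
    return left or evaluate(node.right, tuple_values)
```

For example, the tree for POSITION = 'VISITOR' AND SALARY > 15000 is an
Operator node with connector "AND" and two Leaf children.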
3.2 The User Schema

Another major feature of the executor design is that it
supports a user schema table built from the basic level system
schema information. One objective of this schema is to restrict
the user to information pertinent to himself. Secondly, in a
multi-user environment where the basic level system schema will
have to be protected by a monitor, a user schema will minimize
the congestion that might occur at that level. Its main
objective is to provide the necessary information to allow the
construction of various types of derived relations.

The user schema is built dynamically as the need for the
information arises. Therefore, when the user signs on to the
system, there are no entries in the schema table. An entry is
created whenever the user makes reference to a relation not yet
present. Each entry in the user schema requires a number of
schema queries to the basic level and some processing of the
results. To minimize the number of disk accesses, the
information is kept in the corresponding table as long as
possible, until it must be swapped out. A simple, least
recently used, swapping algorithm is employed. Currently, a user
schema is not kept permanently in the system. All entries are
destroyed as soon as the user signs off.
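
The behaviour described above (entries built on first reference, least
recently used swapping, everything discarded at sign off) can be
sketched as follows. The class name, the capacity parameter and the
query_basic_level callback are assumptions made for illustration; the
reference counter mirrors the per-entry counter that the report lists
for PRIMARY_INFO entries.

```python
class UserSchema:
    """Demand-built schema cache with least-recently-used swapping."""

    def __init__(self, capacity, query_basic_level):
        self.capacity = capacity
        self.query_basic_level = query_basic_level  # hypothetical callback
        self.entries = {}     # relation name -> schema entry
        self.last_used = {}   # relation name -> value of clock at last use
        self.clock = 0

    def access(self, relation):
        self.clock += 1
        if relation not in self.entries:
            if len(self.entries) >= self.capacity:
                # Swap out the entry referenced least recently.
                victim = min(self.last_used, key=self.last_used.get)
                del self.entries[victim]
                del self.last_used[victim]
            # Building an entry costs queries to the basic level.
            self.entries[relation] = self.query_basic_level(relation)
        self.last_used[relation] = self.clock
        return self.entries[relation]

    def sign_off(self):
        # The user schema is not kept between sessions.
        self.entries.clear()
        self.last_used.clear()
```

A repeated reference to the same relation is served from the table
without calling query_basic_level again, which is the saving the text
describes.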

The user schema is much smaller than the basic level system
schema because it contains only information
relevant to the executor's operation. For a primary relation the
user schema provides the information on all the markings built on
that file. Similarly, for a snapshot relation implemented as a
marking, this schema provides information on its primary file and
other markings based on the same file. The user schema provides
fast access to the information essential to the correct use of
the low level system.

Four tables make up the user schema: the HASH table, the
PRIMARY_INFO table, the MARK_INFO table and the DOMAIN_INFO
structure.

HASH Table

This hash and scatter table provides fast access to the
other two tables by the relation name. The table has three
fields: name, relation type and a pointer to MARK_INFO or
PRIMARY_INFO.

An entry in the PRIMARY_INFO table corresponds to a primary
relation. Each entry has the following information: a pointer
to a list of domains involved; the number of markings built on
this primary; a counter for use in the GET_NEXT command; a
counter to keep track of when the entry was last referenced; an
index to the first associated marking (MARK_INFO) and an index to
the HASH table entry.

MARK_INFO Table

An entry in this table corresponds to a relation that is
implemented as a marking. The table contains the following
information: an index to the primary relation on which the
marking is based; the domains involved; a counter used in the
GET_NEXT command; an index into the next marking with the same
primary and an index to the HASH table entry. A marking chain is
embedded in this table. This chain facilitates fast access to
all the markings associated with a primary relation.

DOMAIN_INFO Structure

This structure describes the domains of a relation by
name, type, width and key attributes. These descriptions,
pointed to from MARK_INFO and PRIMARY_INFO, can be shared by
several relations.
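
The marking chain that links PRIMARY_INFO and MARK_INFO entries can be
sketched as follows. Only the linking fields are shown; the field and
function names are assumptions, and new markings are placed at the head
of the chain, which the report does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrimaryInfo:
    name: str
    first_mark: Optional[int] = None  # index of first marking in MARK_INFO
    mark_count: int = 0               # number of markings on this primary


@dataclass
class MarkInfo:
    name: str
    primary: int                      # index of the primary in PRIMARY_INFO
    next_mark: Optional[int] = None   # next marking with the same primary


def add_marking(primaries, marks, name, primary_index):
    """Append a marking and link it at the head of its primary's chain."""
    primary = primaries[primary_index]
    marks.append(MarkInfo(name, primary_index, primary.first_mark))
    primary.first_mark = len(marks) - 1
    primary.mark_count += 1


def markings_of(primaries, marks, primary_index):
    """Follow the embedded chain to list every marking on a primary."""
    names = []
    index = primaries[primary_index].first_mark
    while index is not None:
        names.append(marks[index].name)
        index = marks[index].next_mark
    return names
```

Walking the chain visits only the markings of one primary relation, so
no scan of the whole MARK_INFO table is needed.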

3.3  The  Join  Operations 

There are two types of join operations in the DML
syntax: the implicit and the explicit joins. The difference
between the two types of join is that an explicit join results in
a permanent new relation. Implicit joins, on the other hand, are
merely tools for performing restrictions as single steps within a
multiple relation where clause; the results are not kept after
the step.

An implicit join occurs when the user employs the
selection results from one relation as the selection criteria of
another. It is a form of restriction. Markings are extremely
useful in performing a sequence of implicit join operations,
since the underlying files are locked and the markings created on
them are destroyed as soon as the results are used.

The explicit join provided by the DML is a natural join.
It involves two relations each of which may be the result of a
selection described by a where clause. The natural join of R and
S is:

R*S = {(a,b,c) : (a,b) is in R and (b,c) is in S }

Initially, the executor will employ a straightforward algorithm.
First, the where clauses are applied to R and S producing two
markings R1 and S1. Then, each tuple of R1 is compared to all
tuples of S1. The resulting tuples are stored in a primary
relation. Later versions will exploit inverted lists.
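
The straightforward algorithm can be sketched for binary relations as
follows. This is an illustration under stated assumptions: relations
are lists of pairs, the where clauses are passed as predicate functions,
and the markings R1 and S1 are modelled as filtered lists rather than
as the pointer relations the system uses.

```python
def natural_join(r, s, r_where=lambda t: True, s_where=lambda t: True):
    """Nested-loop natural join of R(a, b) and S(b, c)."""
    # Apply the where clauses first, giving the two "markings".
    r1 = [t for t in r if r_where(t)]
    s1 = [t for t in s if s_where(t)]
    result = []
    # Compare each tuple of R1 with all tuples of S1.
    for a, b in r1:
        for b2, c in s1:
            if b == b2:
                result.append((a, b, c))
    return result
```

The cost is proportional to the product of the sizes of R1 and S1,
which is why applying the where clauses before the comparison step
matters.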

3.4 The Locking Mechanism

The basic level system provides a primitive locking
mechanism for an entire primary relation. Initially, the high
level uses the locking only to open and to close files.

3.5 The Executor Implementation

The architecture of the executor consists of three major
components: the interpreter, a set of schema procedures, and a
set of utilities. All three components are in a single external
procedure to which the schema tables are global.

One characteristic of the procedures in the executor is
their hierarchical organization. A complicated operation is
handled by breaking it down into simpler sub-operations at each
level of the procedure hierarchy. This organization also allows
the sharing of basic procedures.
3.5.1 The Interpreter

The interpreter is the nucleus of the executor. Its
major function is to execute the DML commands using the available
low level facilities. Its procedures are organized in a
hierarchical tree, the root of which is the driver procedure.
Distinct commands are handled by different procedures at the
second level of the hierarchy. Each second level procedure has
its own subtree to process the command. A brief description of
the major components in this hierarchy follows.


DRIVER

The DRIVER is responsible for the coordination of the
following nine procedures:

1. Signs on or off the low level system and initiates the global
   user schema and other static variables.

2. Does the semantic checks. (These will be done by the DML
   compilers when they are implemented.)

3. Determines whether or not the relation involved in a
   modification command is implemented as a primary relation.

4. Locks/unlocks the appropriate primary relations.

5. Creates a snapshot. The DML compiler specifies whether the
   relation is to be a primary or a mark. The main procedures
   are:

   5.1 Resolves an implicit join of a where clause.

   5.2 Generates a marking.

   5.3 Converts a marking into a primary relation. In addition
       to the utilities DESTROY_RELATION, GET-I and INSERT it
       uses CREATE-PRIMARY which defines a primary relation.

   5.4 Adds a new relation to the schema table.

6. Calls SUBCOMMAND to resolve the where clause and then calls
   the basic level MODIFY command.

7. INSERT calls SUBCOMMAND and then the basic level INSERT
   command.

8. DESTROY_RELATION destroys either a primary relation and all
   markings built on it or just a marking alone. There are
   three reasons for destroying associated markings
   automatically. First, markings are meaningless when their
   associated primary relation is destroyed. Second, if the
   user has the authority to destroy the primary file, he must
   have the authority to destroy all associated markings.
   Third, the low level system does no checking on the possible
   error that might result after destroying the relation and
   leaving the markings.

9. The RESET procedure sets the counter of the relation entry in
   the user schema to the beginning of the logical file. GETNEXT
   gets the next tuple from either a marking or a primary
   relation.
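
The cascading rule of DESTROY_RELATION can be sketched as follows. This
is an illustration only: the schema is reduced to a set of primary
relation names and a dictionary from marking name to primary name, both
of which are assumptions rather than the executor's tables.

```python
def destroy_relation(name, primaries, markings):
    """Destroy a marking alone, or a primary with all of its markings.

    primaries: set of primary relation names.
    markings:  dict mapping marking name -> name of its primary.
    Returns the names destroyed, markings first.
    """
    if name in markings:
        # A marking is destroyed on its own.
        del markings[name]
        return [name]
    if name not in primaries:
        raise KeyError(name)
    # Destroy every marking built on the primary before the primary,
    # so that no marking is left pointing at a missing relation.
    destroyed = [m for m, p in markings.items() if p == name]
    for m in destroyed:
        del markings[m]
    primaries.remove(name)
    return destroyed + [name]
```

Removing the markings first reflects the third reason given above: the
low level system would not detect a marking left on a destroyed
relation.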

3.5.2  User  Schema  Procedures 

This set of procedures is responsible for building and
maintaining the user schema dynamically. To find information
about a relation, the user may call the ACCESS procedure. An
index into the HASH table will be returned if the relation exists
in the schema of the low level system. Otherwise, a message will
be returned. As in the interpreter, the user schema
procedures are organized in a hierarchical fashion. The ACCESS
procedure is the parent of the hierarchy and is responsible for
building the user schema. ACCESS is called by routines in the
interpreter when information associated with the relation is
required. The following is a brief outline of the user schema
procedures.

ACCESS determines whether or not the relation is already in the
user schema by calling P-HASH. If this is the case, the
appropriate user schema index will be returned. Otherwise,
it will try to find the information from the system schema by
calling the FILL procedure.

A second procedure uses the HASH table to return an index to
MARK_INFO or PRIMARY_INFO.

FILL finds relation information from the basic level schema
tables and puts that information into the user schema tables.

RELATION_INFO retrieves from the basic level the type, keyed, and
domain fields associated with the relation entry.

A further procedure puts the information retrieved by
RELATION_INFO into a PRIMARY_INFO entry if the relation is a
primary.

Another finds an empty slot in the PRIMARY_INFO or MARK_INFO
tables for the relation entry or swaps an old one out.

The last puts an entry in MARK_INFO if the relation is a marking.

3.5.3  Utilities 

These routines are designed for use by many procedures.
They are the building blocks for both the interpreter and the
schema procedures. For each low level command, there is a
utility routine that handles the generation of that command. A
brief description of some utilities follows.

GET-I retrieves the i-th tuple of a primary or a mark.

DESTROY_RELATION destroys a relation.

CREATE-PRIMARY/INSERT/ON/OFF/UPDATE/DELETE/GET_MARK generate the
respective basic level commands.

Debugging and Error Handling Facilities

A comprehensive tracing facility which prints input,
output and intermediate values of major sections has been
implemented in the system. This tracing mechanism is controlled
by three bits. Each bit, when switched on, requests a trace of a
particular section of the executor. In particular, one bit is
for the interpreter and another is for the schema procedures.
The remaining bit controls the printing of the input DML commands
and the output low level commands.

The error routine ERROR_MESSAGE is called whenever an
error is detected. It accepts a single parameter --the error
message number-- and prints the corresponding error message while
turning on a system global bit to indicate that an error has
occurred.
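
The three trace bits and the error routine can be sketched as follows.
The bit assignments, the message table and the class name are all
hypothetical; the report states only that there are three bits and that
ERROR_MESSAGE takes a message number and sets a global error bit.
Output is collected in a list here rather than printed.

```python
# Assumed bit assignments for the three trace switches.
TRACE_INTERPRETER = 0b001
TRACE_SCHEMA = 0b010
TRACE_COMMANDS = 0b100   # input DML and output low level commands

# Hypothetical message table; the real texts are not given in the report.
ERROR_TEXT = {1: "relation not found"}


class Diagnostics:
    def __init__(self, trace_bits=0):
        self.trace_bits = trace_bits
        self.error_occurred = False   # stands in for the global error bit
        self.output = []

    def trace(self, section_bit, text):
        # Emit the text only when the bit for that section is switched on.
        if self.trace_bits & section_bit:
            self.output.append(text)

    def error_message(self, number):
        self.output.append(ERROR_TEXT.get(number, f"error {number}"))
        self.error_occurred = True
```

Because each section has its own bit, any combination of the three
traces can be requested at once.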

Summary 

This chapter has described the EXECUTOR, the
intermediate level of ZETA. The executor accepts relation
oriented commands from the language facilities level and executes
them using the tuple-oriented primitives of MINIZ. The data
structure language interface with the language level was
described as was the process of command analysis and execution.
Finally, the EXECUTOR implementation was outlined.

The main features of the EXECUTOR are the multiple
relation view provided and the ability to derive new relations
using snapshots.
Chapter 4

Languages: High Level

The language level uses the intermediate level
primitives to provide the highest logical level of ZETA. It
provides two modes of access: a programmable interface through a
host programming language and a self-contained problem oriented
interface through a query language generating system.

4.1 Data Base Communication Facility

Communication with a data base can be divided into two
parts: the data definition language (DDL) and the data
manipulation language (DML). The DDL is used to describe the
data in logical terms (e.g., names, associations) and in physical
terms (e.g., representations, access methods). These
descriptions are templates which the DBMS uses to maintain data
occurrences. The DML is used to access and direct the
manipulative facilities of the DBMS such as store, delete,
insert, modify and retrieve.

A data base communication facility can be provided in
two types of systems: a host programming language system (HLS)
and a self-contained language system (SLS). In an HLS, the DML
and DDL are embedded in the host programming language by way of
procedure calls to the DBMS. An HLS is procedural in that a user
must state how an operation is to be done. An SLS is an
independent language facility with a complete set of DBMS
communication capabilities. It is non-procedural in that the
user states only what is to be done.


Both an SLS and an HLS have distinct advantages. An HLS
gives users the extended capabilities of the host programming
language to store intermediate results and generated DML
commands. This extends the use of the DBMS to create complex
queries beyond the capabilities of the DML, to produce tailored
reports or to compute special functions. An HLS is as flexible
as the host language it uses. However, an SLS is often designed
for casual or non-programmer users as it can have an English-like
syntax and may be oriented to a specific application which is
familiar to the user. An SLS economically handles ad-hoc,
one-time queries since program development time is reduced. Two
drawbacks of the SLS approach are that an SLS has fixed and
limited capabilities and that it is a major task to build an SLS.

ZETA attempts to provide access to both an SLS and an
HLS. The HLS is described in the next section (4.2). The SLS is
provided through a query language generator system (QLS) which is
described in the subsequent section (4.3). The QLS is a compiler
generator which facilitates the production of an SLS.

The SLS compilers, produced by the QLS, exist at the
same level as the PL/I compiler in the HLS. The PL/I compiler
produces object code that contains calls to external data base
procedures. In turn, the DML compiler translates the commands
into the data structure language of the EXECUTOR which is called
to perform the operation. The SLS compiler could produce the
same object code. This is the simplest procedure; however, it
requires the SLS compiler to duplicate some tasks of the DML
compiler. In ZETA, the SLS compilers converse directly with the
EXECUTOR in the data structure language for the sake of
efficiency. This arrangement provides a stable basis for the HLS
and allows the SLS to change without impacting the HLS.
4.2 A Relational Host Programming Language System (HLS)

This facility provides access to the data base from a
host programming language (PL/I in this case) via a relational
DML sublanguage. This communication is implemented using two
components: a preprocessor and the DML compiler. An application
program passes through the preprocessor which replaces the DML
statements with PL/I procedure calls to the DML compiler plus
other related statements. The resulting PL/I code is compiled in
the normal fashion and linked with the DML compiler. On
execution, the DML compiler accepts procedure calls which contain
the DML commands in character form. The commands are analyzed
and the appropriate data structures are generated for the
executor.

4.2.1 The Relational DML Sublanguage

The sublanguage provides a data qualification capability
similar to that of SEQUEL [Chamberlin and Boyce 1974, 5] for
specifying subsets of the data base for retrieval and
modification. Interactions with host program variables can be
used to hold temporary results and to generate dynamically
created requests for complicated results and queries formulated
at execution time. Two commands, GET_NEXT and INSERT, operate on
one tuple at a time; the remaining commands refer to sets of
tuples. Retrievals are accomplished by first specifying and
creating a "snapshot" which is a sophisticated implementation of
a workspace. Then, GET_NEXT is used to retrieve data values from
the snapshot.


The following metalanguage symbols are used:

Angular brackets < and > denote syntactic entities or phrases.

OR_symbol | denotes a choice among syntactic entities or
keywords.

Square brackets [ ] indicate an optional string.

Braces { } factor out common entities or keywords.

List_of_clause <list of { } > denotes one or more repetitions,
separated by commas, of the entities inside the braces.

Data Qualification

The Where Clause

The where clause is the basic restriction facility. It
enables the user to choose a subset of the tuples of a relation
based on Boolean combinations of conditions on domain values.
The conditions may involve a comparison on a single value or a
restriction of a domain to be one of a set of values. This set
of values may be listed explicitly or it may be stated as an
implicit join. The qualification of one domain in a where clause
by another where clause or an arithmetic function acting on
another relation is called an implicit join.

The where clause syntax:

<where clause> ::= <Boolean term>
                 | <where clause> OR <Boolean term>

<Boolean term> ::= <Boolean factor>
                 | <Boolean term> AND <Boolean factor>

<Boolean factor> ::= <condition> | (<where clause>)

<condition> ::= <single value> <rel_op> <single value>
              | <domain name> IS [NOT] ONE OF (<set of values>)

<rel_op> ::= >= | ¬= | <= | = | < | >

<single value> ::= <data value>
                 | <built_in_fun> (<set of values>)
                 | <HPL variable>
                 | <domain name>

<built_in_fun> ::= TOTAL | AVG | MAX | MIN | COUNT

<set of values> ::= <set term> | <set of values> UNION <set term>

<set term> ::= <set factor> | <set term> INTERSECT <set factor>

<set factor> ::= <set primary> | <set factor> EXCLUDE <set primary>

<set primary> ::= <set> | (<set of values>)

<set> ::= <list of { <single value> }>
        | [UNIQUE] <relation name> <domain name>
          [WHERE <where clause>]

The UNIQUE Option

Relational operations which strike out attributes may
leave duplicate tuples in the resulting relation. The UNIQUE
option effectively strikes out these duplicate tuples.
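
The effect of UNIQUE can be sketched on a projection as follows. This is
an illustration only: tuples are modelled as Python tuples, attributes
are selected by position, and the function name is hypothetical. The
first occurrence of each tuple is kept.

```python
def project(tuples, positions, unique=False):
    """Keep the attributes at the given positions, optionally deduplicated."""
    projected = [tuple(t[p] for p in positions) for t in tuples]
    if not unique:
        return projected
    seen = set()
    result = []
    for t in projected:
        # Striking out attributes can make distinct tuples identical.
        if t not in seen:
            seen.add(t)
            result.append(t)
    return result
```

For instance, projecting COURSE (PROF, COURSE-NO) onto PROF yields one
tuple per course taught, so a professor with two courses appears twice
unless UNIQUE is given.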

Snapshot Definition Facility

The snapshot facility allows the derivation and naming
of a relation which is the result of a qualification of one
relation or the join of two relations. Snapshots do not reflect
any changes to the data base after their creation. They can be
used for retrieval; however, they cannot be modified. When they
are no longer needed or are not sufficiently current, snapshots
can be destroyed. Snapshots facilitate derived relations which
generally have a restricted amount of data in them. This last
property makes the snapshot the chief tool for retrieval in the
DML. By forcing subsequent retrievals through a snapshot, the
system is free to update primary relations without a host program
holding up the data base with excessive read-only locks.

The CREATE/DESTROY SNAPSHOT syntax is:

CREATE_SNAPSHOT_REL <relation name>
    FROM [UNIQUE] <subrelation specification>
    [ <join clause> ]
    [ <sort clause> ]

DESTROY_SNAPSHOT <relation name>

<subrelation specification> ::= <relation name>
                                [ <domain list> ]
                                [ WHERE <where clause> ]

<join clause> ::= JOIN <subrelation specification>
                  BY MATCHING <list of {<relation name> <domain name>
                  WITH <relation name> <domain name> }>

<sort clause> ::= SORT <domain name> { UP|DOWN }
                  [<list of { [ THEN BY ] <domain name> { UP|DOWN } }> ]

<domain list> ::= <list of {<domain name> [ RENAMED <name> ] }>

The following relations will be used to illustrate
the DML operations:

STUDENT (SNAME, ADDR, DEPT, CREDIT)

COURSE (PROF, COURSE-NO)

CURRICULUM (DEPT, COURSE-NO)

STAFF (PROF, SALARY, HOURS, POSITION, CHAIRMAN)

The following two examples illustrate several features:
a <domain list> providing a projection; a <where clause>
providing selection; nested where clauses providing implicit
joins and an explicit join.

Example: Define a snapshot called MATH-PROF consisting of the
professors who teach math courses.

CREATE_SNAPSHOT_REL MATH-PROF
    FROM COURSE
         PROF
         WHERE COURSE-NO IS ONE OF (
               CURRICULUM COURSE-NO
               WHERE DEPT = 'MATH')

Example: Define a snapshot called VISITOR which contains the
names, course numbers and departments of visiting professors
who make more than $15,000 and do not teach in the economics
department.

CREATE_SNAPSHOT_REL VISITOR
    FROM COURSE
         PROF, COURSE-NO
         WHERE PROF IS ONE OF (UNIQUE
               STAFF
               PROF
               WHERE POSITION = 'VISITOR'
                     AND
                     SALARY > 15000)
    JOIN CURRICULUM
         DEPT
         WHERE DEPT ¬= 'ECONOMICS'
    BY MATCHING COURSE COURSE-NO
    WITH CURRICULUM COURSE-NO
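
The first example (professors who teach math courses) can be traced with a short sketch. The code below only illustrates the semantics described in the text: the nested where clause acts as an implicit join, and the snapshot is a copy that does not follow later changes. The helper `create_snapshot` and the sample tuples are hypothetical and are not part of ZETA.

```python
import copy


def create_snapshot(rows):
    """A snapshot is a private copy; later base-relation changes do not show."""
    return copy.deepcopy(rows)


# Hypothetical sample data, not taken from the report
course = [
    {"PROF": "SMITH", "COURSE-NO": "M101"},
    {"PROF": "JONES", "COURSE-NO": "E110"},
    {"PROF": "BROWN", "COURSE-NO": "M205"},
]
curriculum = [
    {"DEPT": "MATH", "COURSE-NO": "M101"},
    {"DEPT": "MATH", "COURSE-NO": "M205"},
    {"DEPT": "ECONOMICS", "COURSE-NO": "E110"},
]

# Inner clause: CURRICULUM COURSE-NO WHERE DEPT = 'MATH'
math_courses = {r["COURSE-NO"] for r in curriculum if r["DEPT"] == "MATH"}

# Outer clause: COURSE PROF WHERE COURSE-NO IS ONE OF (inner clause)
math_prof = create_snapshot(
    [{"PROF": r["PROF"]} for r in course if r["COURSE-NO"] in math_courses]
)

# A later change to the primary relation is not reflected in the snapshot.
course.append({"PROF": "WHITE", "COURSE-NO": "M101"})
```

After the append, `math_prof` still holds only the two professors found when the snapshot was created.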


Retrieval and Associated Commands

GET_NEXT and RESET

GET_NEXT retrieves the next tuple of the primary or
derived relation indicated by one of several implicit pointers or
currency indicators. The STORE phrase puts the values of named
attributes into named variables in the host language program.
Host program variables and currency indicators may be used as
free variables to store the results of one retrieval for later
use with a second retrieval. The RESET command places the
GET_NEXT indicators at the physically first tuple in the
relation. The syntax for GET_NEXT and RESET is:

GET_NEXT [UNIQUE] <relation name>[ (<currency indicator id>) ]
    STORE <list of {<domain name> INTO <HPL variable>}>
    [ON END_OF_REL <HPL statement>]

RESET <relation name> [ (<currency indicator id>) ]

<currency indicator id> ::= <integer>

Example: Retrieve the next tuple in the STUDENT_LIST snapshot.

GET_NEXT STUDENT_LIST
    STORE SNAME INTO HPV1,
          ADDR INTO HPV2
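
The behaviour of currency indicators can be sketched as follows. This is a minimal model under stated assumptions, not the ZETA implementation: the class `OpenRelation`, its method names and the sample tuples are hypothetical, and the end-of-relation case is modelled simply by returning `None`.

```python
class OpenRelation:
    """Hypothetical stand-in for an opened relation with currency indicators."""

    def __init__(self, tuples):
        self._tuples = list(tuples)
        self._currency = {}          # indicator id -> index of the next tuple

    def reset(self, indicator=0):
        """Place the indicator at the physically first tuple."""
        self._currency[indicator] = 0

    def get_next(self, domains, indicator=0):
        """Return the named domain values of the next tuple, or None at the end."""
        pos = self._currency.get(indicator, 0)
        if pos >= len(self._tuples):
            return None              # models the end-of-relation condition
        self._currency[indicator] = pos + 1
        row = self._tuples[pos]
        return tuple(row[d] for d in domains)


student_list = OpenRelation([
    {"SNAME": "ADAMS", "ADDR": "12 ELM ST"},
    {"SNAME": "BAKER", "ADDR": "9 OAK AVE"},
])

hpv1, hpv2 = student_list.get_next(["SNAME", "ADDR"])    # first tuple
other = student_list.get_next(["SNAME"], indicator=1)    # independent indicator
student_list.get_next(["SNAME", "ADDR"])                 # second tuple
at_end = student_list.get_next(["SNAME", "ADDR"])        # relation exhausted
student_list.reset()
again = student_list.get_next(["SNAME"])
```

Each indicator advances independently, which is what allows one retrieval loop to be nested inside another over the same relation.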

OPEN and CLOSE

In order to be accessed, relations must be opened. This
resets currency indicators and locks relations for read-only
access. To avoid a deadlock over relations, open commands may
not be nested; however, relations may be closed separately. The
syntax is:

{ OPEN | CLOSE } <list of { <relation name> }>

UPDATE

In the named relation, all tuples which satisfy the where
clause have the named domains updated to the given single value.
The syntax for UPDATE is:

UPDATE <relation>
    <list of { <domain> TO <single value> }>
    [WHERE <where clause> ]

Example: Update the department chairman's salary in the STAFF
relation to the maximum salary found in that relation.

UPDATE STAFF
    SALARY TO MAX (STAFF SALARY)
    WHERE POSITION = 'CHAIRMAN'
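
The chairman example can be traced with the sketch below. It assumes, as a reading of the syntax rather than a documented fact, that the built-in function is evaluated to a single value before any tuple is changed. The function `update` and the sample data are hypothetical.

```python
def update(relation, assignments, where=lambda row: True):
    """Set each named domain to a single value in every qualifying tuple."""
    changed = 0
    for row in relation:
        if where(row):
            row.update(assignments)
            changed += 1
    return changed


# Hypothetical sample data, not taken from the report
staff = [
    {"PROF": "SMITH", "SALARY": 18000, "POSITION": "CHAIRMAN"},
    {"PROF": "JONES", "SALARY": 21000, "POSITION": "PROFESSOR"},
    {"PROF": "BROWN", "SALARY": 15500, "POSITION": "VISITOR"},
]

# MAX (STAFF SALARY) is reduced to a single value first (assumption).
max_salary = max(row["SALARY"] for row in staff)
n = update(staff, {"SALARY": max_salary},
           where=lambda row: row["POSITION"] == "CHAIRMAN")
```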


INSERT

The tuple given by the list of domain-value pairs is added to
the named relation. The syntax for INSERT is:

INSERT_INTO <relation name>
    <list of { <domain> WITH <single value> } >

DELETE

All tuples which satisfy the where clause are deleted from
the named relation. The syntax for DELETE is:

DELETE FROM <relation name>
    [WHERE <where clause>]

A completion code value indicates the success or degree
of failure of a command. At the end of each command the system
passes an integer and a textual completion code to prearranged
host programming language variables. These variables must be set
up at the beginning of each application by means of the following
two DML statements:

COMPLETION_CODE IS <HPL variable>

COMPLETION_MESSAGE IS <HPL variable>

On Conditions For Error Handling

Unacceptable conditions may be detected by the
completion code value or by Boolean combinations of simple
conditions. These conditions can be used to stop processing and
to execute exception handling code. The on-condition statement
has the syntax:

ON <condition list> DO <HPL statement list>

where <condition list> is a Boolean combination of:

<HPL variable> <rel_op> <integer>

USER IDENTIFICATION

The DBA uses both commands to identify a user to the
system and then to give the user a password. Thereafter, the
password must be given at the beginning of a program. PASSWORD
may also be used by the user to change the password.

USER IS <HPL variable>

PASSWORD IS <HPL variable>


4.2.2 The Preprocessor and The DML Compiler

The main advantage of the preprocessor is that it
provides protection by restricting DML and HPL features and by
verifying the syntax. A preprocessor could do the semantic
checking and generate data structures for the executor; however,
this would introduce a time delay between translation and
execution. Hence, the preprocessor checks syntax and generates a
procedure call form of the DML commands. At execution time, the
DML compiler analyzes the DML commands. After making the
syntactic and semantic checks, the compiler generates the data
structures for the EXECUTOR, with no significant time delay.

The method of analysis and data structure generation
deserves mention. Multi-pass, multi-relation qualifications are
broken down into a sequence of single pass, single relation
qualifications called blocks. As they are encountered, the
blocks are put onto a LIFO stack called the execution stack. A
block corresponds logically to a single where or join clause.
Physically, blocks are stored in the CONTROL, SUBCOMMAND and
Boolean condition tree structures described in chapter 3.

The following example illustrates the block analysis:

STUDENT_FILE (NAME, ADDRESS, TOTAL_CREDIT)

GRADUATE_FILE (NAME, PROGRAM)

ABSENT_FILE (NAME)

We want to find those Ph.D. students who have been
absent for more than two years and have earned less than the
average TOTAL_CREDIT in the STUDENT_FILE. The DML command which
creates this list in a snapshot is:

CREATE_SNAPSHOT_REL SPECIAL_STUDENT
    FROM STUDENT_FILE
         NAME, ADDRESS
         WHERE NAME IS ONE OF
               (GRADUATE_FILE
                NAME
                WHERE PROGRAM = 'PHD'
                      AND
                      NAME IS ONE OF
                      (ABSENT_FILE
                       NAME) )
               AND
               TOTAL_CREDIT < AVG
               (STUDENT_FILE
                TOTAL_CREDIT)


The above command has four nested blocks, namely:

Block 1: FROM STUDENT_FILE
         NAME, ADDRESS
         WHERE NAME IS ONE OF
               (Block2 (Block3))
               AND
               TOTAL_CREDIT < AVG (Block4)

Block 2: GRADUATE_FILE
         NAME
         WHERE PROGRAM = 'PHD'
               AND
               NAME IS ONE OF (Block3)

Block 3: ABSENT_FILE NAME

Block 4: STUDENT_FILE TOTAL_CREDIT
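
The block analysis can be sketched as a walk over the nested qualification that pushes one block per single-relation clause onto a LIFO stack. The representation below (nested dictionaries, the function `decompose`, the `uses` list) is an assumption made for illustration; the report stores blocks in the CONTROL, SUBCOMMAND and Boolean condition tree structures instead.

```python
def decompose(query, stack):
    """Push one block per single-relation qualification, outermost first."""
    number = len(stack) + 1
    block = {"id": number, "relation": query["relation"], "uses": []}
    stack.append(block)                      # pushed as it is encountered
    for sub in query.get("nested", []):
        block["uses"].append(decompose(sub, stack))
    return number


# Nesting of the four-block example; predicates are omitted for brevity.
query = {
    "relation": "STUDENT_FILE",
    "nested": [
        {"relation": "GRADUATE_FILE",
         "nested": [{"relation": "ABSENT_FILE"}]},
        {"relation": "STUDENT_FILE"},        # argument of AVG
    ],
}

execution_stack = []
decompose(query, execution_stack)
numbering = [(b["id"], b["relation"], b["uses"]) for b in execution_stack]

# Popping the LIFO stack runs inner blocks before the blocks that use them.
run_order = []
while execution_stack:
    run_order.append(execution_stack.pop()["id"])
```

Because a block is always pushed before the blocks nested inside it, popping the stack yields an order in which every block's inputs are already available when it runs.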


4.3  A  l§.n3ys.3§  x.y§i§.l  Z2A  Z§32ii§3§Z 


The  query  language  system  (QLS)  is  used  in  the  following 
way.  The  query  language  designer  determines  the  needs  of  the 
prospective  users  and  expresses  them  in  a  language  design.  The 
grammar, interspersed with special codes wherever special
semantic actions need to be performed, is fed into the compiler
generator to produce syntax tables recognizable by the parser.
Semantic action routines, written in the DML/host language by the
designer, are input to the HPL compiler for generation into an
intermediate object module. This is then linked to the parser
and scanner routines of the compiler generator to form the query
language compiler.

At compile time, the generated compiler would, with the
aid of the syntax tables, parse the source language input and
create the appropriate object code. When the SLS object code is
executed, the EXECUTOR is called to execute the data base
operations.

In addition to facilitating the production of
specialized SLSes, the QLS provides two further capabilities:
extension and reduction. An SLS may have syntactic and/or
operative capabilities added to it by regenerating the old SLS
with the added syntax and semantic routines. Perhaps more useful
and surely simpler to use is the ability to restrict the syntax
and/or the capabilities of an SLS. This is done by producing the
old SLS with the appropriate syntactic entities deleted; the
semantic routines may remain the same. Restriction has the
benefit of efficiency and security since the expensive and
protected features can be dropped from the sublanguage. The QLS
makes compiler generation easier and gives an element of
extensibility to the compilers so produced.

The necessary support features for the QLS are described
in section 4.3.1. The QLS design and implementation is given in
section 4.3.2 and a sample SLS query language produced with the
QLS is presented in section 4.3.3.

4.3.1 DBMS Support For The QLS

The QLS in ZETA is designed to be built on top of most
DBMSes which meet three basic requirements. First, a complete
command set should provide for the creation and destruction
of any data item. Second, a complete DML command set should
include data qualification, using searches on several criteria,
data retrieval, modification, insertion and deletion. Finally, a
schema query facility should provide sufficient information about
any data item so that semantic checks can be performed.

Although they are not essential for the production of
the QLS, the following features greatly simplify the task. First
is the ability of the QLS and the DBMS to converse by character
strings. The QLS must deduce the structure of the data since the
user does not give it. This interpretation is simplified if all
the data is in string form. The second set of options is the
operations of sort, merge and join plus the mathematical
functions sum, max, min, average, etc. These operations are done
more efficiently by the DBMS because it may use inverted lists
and because little data need pass out of the data base.

4.3.2  The  Compiler  Generator 

The QLS compiler generator was based on the design and
implementation of TACOS [Gaffney 1969, 18]. The basic philosophy
behind the design of TACOS is to facilitate changes in both
syntax and semantic specifications without requiring a complete
or expensive compiler regeneration. This is achieved by
separating the syntax tables and the semantic routines. They are
combined to form a compiler at compile time, that is, when source
text is presented for compilation. To this design, a number of
alterations, optimizations and additions were made.

The QLS is a compiler generator which was produced using
the bootstrapping process. Hence, its structure is similar to
its product. The "driver" module, which is expected to be
included in the produced compilers, includes the parser and the
scanner. The role of the "driver" is to read in the syntax
tables and source input, and to call on the parsing routine to
parse the source code. The module containing the semantic
routines is called by the parser at appropriate points during the
parsing process. The module would be replaced by the user's own
semantic routines in the produced compiler.

An altered Backus Normal Form is used as the QLS syntax
language. It is called IBNF ("interpretive" BNF) by its designer
[Gaffney 1969, 18] because an interpretive parsing algorithm is
employed in the recognition stage. Parenthesized expressions
were introduced to reduce the size of syntax specifications.
Special notation which denotes the number of occurrences
allowable for the associated phrase class is used to circumvent
the left-recursion problem. Another construct is provided for
the specification of semantic routines. The grammar for IBNF has
been reproduced in Appendix A.

The scanner is capable of recognizing five token types:
identifiers, unsigned integer constants, quoted character strings,
bit strings and terminal symbols or literal strings. The scanner
is actually five separate procedures -- one for each token type.
There are four predefined nonterminals available for users. They
are: the identifier (<*I>), the integer constant (<*N>), the
quoted string constant (<*S>) and the bit string (<*B>). Whenever
these occur, the appropriate scanning routine is called.
Identifiers and integer constants are equivalent to those in PL/1
and MINIZ.
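
A scanner that distinguishes the five token types might look like the sketch below. The lexical rules are assumptions modelled loosely on PL/1 conventions (a bit string written as a quoted run of 0s and 1s followed by B, a doubled quote inside a character string); the report does not give the exact rules, and a single regular expression replaces its five separate procedures.

```python
import re

# Assumed PL/1-like spellings; the exact lexical rules are not in the report.
TOKEN_PATTERNS = [
    ("BIT_STRING", r"'[01]*'B"),
    ("STRING",     r"'(?:[^']|'')*'"),
    ("INTEGER",    r"[0-9]+"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9_]*"),
    ("TERMINAL",   r"<=|>=|[^\sA-Za-z0-9']"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_PATTERNS))


def scan(text):
    """Split `text` into (token type, lexeme) pairs, skipping blanks."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unrecognised character at {pos}: {text[pos]!r}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


tokens = scan("SALARY >= 15000 AND FLAGS = '101'B OR NAME = 'O''NEIL'")
```

In this sketch keywords such as AND come back as identifiers; a real scanner driven by syntax tables would check them against the terminal symbols of the grammar.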

One alteration to TACOS was to restructure the product
compilers so as to handle multiple compilations in one "session".
This is to provide interactive SLS compilers.

The majority of usage is expected to be by
non-programmers for non-production or one-time-only requests. Hence,
the stress would be on good diagnostic messages and no stringent
requirement would be made on compilation speed. As it is easier
to produce good runtime debugging facilities in an interpretive
environment and because a good part of the execution is spent in
the underlying data base management system and output routines,
an interpretive compiler is well suited to this system. However,
as the same processes are often repeated on each record of the
logical data base, straight interpretation is costly. For the
sake of efficiency, each user request should be processed in two
phases -- a compilation phase where the source "programme" is
translated into intermediate code and an execution phase where
the actions specified by the intermediate code are carried out to
produce the desired result.

A major change to TACOS was the inclusion of a general
macro facility. From past experience, macro-like facilities
which accommodate alias or abbreviation definitions are quite
popular with the casual user [Schuster 1973, 34]. Allowing users
to replace keywords and parts of or whole commands with words of
their choice actually aids them in the communication of their
requests.

Macros may be used for the purpose of language
extension, language translation, text generation and systematic
editing [Brown 1969, 4]. The facility that is desired here falls
into the language translation category -- mapping onto a base
language. Text and computation macros [Cheatham 1966, 7] are not
appropriate for an SLS macro facility; however, syntactic macros
are. Syntactic macros are detected, replaced and analysed during
the syntactic analysis stage of a compiler.

Syntactic macros can be divided into two types -- called
SMACROs and MACROs by Cheatham. SMACRO calls are only recognized
in a context where a syntactic class may occur. Thus, there is
an association of macros with syntactic classes of the base
language. MACROs do not have to form a syntactic class of the
base language and can occur anywhere in the source text. Their
presence is denoted by a special marker. Since MACROs do not
depend on syntactic structures, they are best suited for a
general macro facility. Henceforth, macro is synonymous with
MACRO, as defined by Cheatham.

In the provision of a generalized macro facility, four
assumptions have been made in its design and implementation:

1. The presence of a macro call is denoted by a special marker
   of the designer's choice followed by the name of the macro.

2. Any arguments to the macro call must be in the form of
   literal string constants; arguments are enclosed by one
   pair of parentheses and follow immediately after the macro
   name.

3. Occurrence of a parameter is denoted by a special marker
   (also of the designer's choice) followed by a numeric
   constant which indicates the position of the corresponding
   argument in the macro call. This type of identification is
   known as designation_by_number.

4. Methods of macro definition, cancellation and storage are
   left entirely to the compiler designer. In order to do this,
   the scanning routine assumes the existence of a designer
   written routine (GETMAC) which retrieves the macro text given
   the macro name.

With the above assumptions, the macro facility provided
can be quite powerful. Macro calls may occur anywhere in the
source as long as they can be expanded into syntactically and
semantically recognizable entities. Nesting is allowed. This
can be achieved by the presence of a macro call within another
macro text or within an actual parameter.

The scheme used in the implementation of the macro
facility is very similar to that presented by Gries [Gries 1971,
19]. To facilitate macro calls, a macro buffer and a macro call
stack are employed. The macro buffer, in the form of a character
array, is used to hold actual macro parameters and macro texts
which are active. The macro stack is used to save information
about the scanning source (macro or input buffer), the current
scanning point in the macro buffer, the location of the last valid
character in the macro buffer and the macro call parameters.
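
The four assumptions and the stack-based scheme can be sketched as follows. This is an illustration only, not the ZETA or Gries implementation: the markers `%` and `&`, the function `expand`, the `getmac` callback standing in for the designer-written GETMAC routine, and the sample macros are all assumptions. Each stack frame plays the role of one saved scanning source, and a single list of frames replaces the separate macro buffer and macro call stack.

```python
import re

MACRO_MARK = "%"      # assumed call marker (the designer's choice in the report)
PARAM_MARK = "&"      # assumed parameter marker

CALL = re.compile(re.escape(MACRO_MARK)
                  + r"([A-Za-z][A-Za-z0-9_]*)(?:\(([^)]*)\))?")
ARG = re.compile(r"'([^']*)'")                  # literal string arguments
PARAM = re.compile(re.escape(PARAM_MARK) + r"([0-9]+)")


def expand(source, getmac, max_depth=50):
    """Expand macro calls in `source`; `getmac(name)` returns the macro text."""
    out = []
    stack = [{"text": source, "pos": 0, "args": []}]
    while stack:
        frame = stack[-1]
        text, pos = frame["text"], frame["pos"]
        if pos >= len(text):
            stack.pop()                         # resume the interrupted source
            continue
        ch = text[pos]
        if ch == MACRO_MARK:
            m = CALL.match(text, pos)
            if m:
                if len(stack) >= max_depth:
                    raise RecursionError("macro nesting too deep")
                frame["pos"] = m.end()
                args = ARG.findall(m.group(2) or "")
                stack.append({"text": getmac(m.group(1)), "pos": 0, "args": args})
                continue
        if ch == PARAM_MARK:
            m = PARAM.match(text, pos)
            if m and 1 <= int(m.group(1)) <= len(frame["args"]):
                frame["pos"] = m.end()
                actual = frame["args"][int(m.group(1)) - 1]
                # The actual parameter is rescanned, so it may hold macro calls.
                stack.append({"text": actual, "pos": 0, "args": []})
                continue
        out.append(ch)
        frame["pos"] = pos + 1
    return "".join(out)


# Hypothetical macro definitions
macros = {"LIMIT": "15000", "RICH": "SALARY > &1"}
expanded = expand("WHERE %RICH('%LIMIT') AND POSITION = 'VISITOR'",
                  macros.__getitem__)
```

The call to RICH carries a macro call inside its actual parameter, so the example exercises both forms of nesting mentioned above. The sketch does not handle arguments that themselves contain a closing parenthesis or a quote.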

There are several faults inherent in the design of the
macro facility. First, the task of managing the macro file has
been shifted to the compiler designer. When a macro call occurs,
a designer written routine must perform a file search or use the
lower DBMS to retrieve the macro text. Second, during the

Data base requests can be separated into two components
[Lefkovitz 1969, 23]. The first is the file specification
section, where the user provides the file name(s), possibly with
the conditions for selecting portions of the file(s). It is
followed by the action clause, in which the user specifies what
is to be done. The format for data base requests is:

FOR <file_specification> <action_clause>.

Data base requests in the sample language are similar to
those in the DML sublanguage. File specification is expressed
using a where clause. In addition to the DML actions (create,
delete, destroy, update and insert) there are three further
actions: merge/join, sort and output/report. These additional
actions will now be described briefly.

The merge/join action allows two files or portions
thereof to be merged by pairing one or more fields in the two
files. Requests for merging files are submitted by means of a
join clause in which the linking pairs and the second file are
identified. Because there is now more than one file involved, a
field name may no longer uniquely identify the field desired. In
order to resolve ambiguities, the user is provided with the
capability to qualify field names in the create and output
clause.

The understanding of information can be aided by some
form of ordering of retrieved items. Since a characteristic of a
relational data base is that the ordering of tuples or records is
irrelevant, the sorting facility is restricted to the output
clause. Fields may be sorted within other fields in both
increasing and decreasing order of values.

The output/report action provides an output facility for
simple lists or for complicated reports. Within the output
clause, a number of features are provided to facilitate the
generation of reports:

device_specification
    The user can specify the output device -- terminal, printer
    or disk file -- to be used. If the output is to be stored in
    a disk file, then the name of that file must be furnished by
    the user. In the event that no device is given, output is
    automatically routed to the terminal.

report_heading_specification
    Up to five lines may be written by the user as the report
    heading. A user has the capability to control the location
    of the report heading. Linewise, the report heading may be
    centered on a separate page similar to a title page, or
    printed at the top of the same page as the retrieved
    information. Columnwise, the heading may be left or right
    justified or centered.

column_heading_specification
    Up to five lines of column heading may be provided by the
    user for each item to be printed. Column headings are
    printed at the top of each page (except the title page) of
    the report.

display_format_specification
    The display formats which can be specified by the user
    include spacing control and item output format. Unless the
    user requests otherwise, two spaces are inserted after the
    printing of each field. Spacing control may be given in the
    following format:

        X ( n )

    where n is the number of spaces to be inserted after the
    field. The item formats which are available to users are
    similar to those in PL/I. Each display format must appear
    before the associated item specification.

The macro facility provided by the compiler generator is
used. Macro names are delimited by the character % and may be up
to thirty-two characters in length. Formal parameters, which are
delimited  by  the  character  P-,  can  be  used  in  the  macro  text. 
Macros are defined in the following format:

DEFINE <macro_name> AS <macro_text>.

No duplicate macro name is allowed. Once a macro has been
defined, it may be detected by the presence of the macro name in
a user command.

When a macro is no longer needed, it may be deleted from
the user's macro file by means of a cancel command which has the
following format:

CANCEL [MACRO] <macro_name>.

Questions about language syntax, macros and data base
schema may be formulated by the users. The format of inquiry is
as follows:

{ LIST | WHAT IS | WHAT ARE } <subject_of_inquiry>.

For syntax queries, the keyword SYNTAX is used. This
keyword may be succeeded by a list of nonterminal names which may
or may not be enclosed by angular brackets (i.e., < and > ). If
the list of nonterminal names is not present, then the syntax of
every production for the language which is being used by the
questioner is listed. If nonterminal names are specified, then
only the productions for these nonterminals are printed.

Macro queries are denoted by the keywords MACROS or
MACRO. The keyword may be followed by a list of macro names.
The omission of macro names will result in the listing of all
macros defined by the user.

There is a variety of schema queries available to users.
A user may want to know about all the files. In this case the
keywords FILES or FILE would be used. If the user desires
information on specific files, then he would list the file names
after the keyword. To obtain the description of all the fields,
the keywords FIELDS or FIELD may be used. As for files, these
keywords may be followed by specific field names. Users may
further qualify the field names by the presence of a file name.
If a user does not know whether an identifier is a field name or
file name, then by naming that identifier, information on any
fields or files with that name will be provided.

A sample language syntax was designed (see Appendix B).
To test our theory on syntax subsets, several versions were
produced. The capabilities and commands which have been discussed
belong to one version of the syntax. A subset of that version
was produced with the delete, insert, update, destroy, create and
join commands and report generation facility removed. Another
subset omits the sort facility, macro facility and inquiry
facility.

Design and Implementation of the Compiler

The task of producing the compiler for the sample
language is made easier by the use of the compiler generator.
The design of the compiler is modular in nature to facilitate
future modifications and extensions. As discussed earlier, an
interpretive compiler is best suited for our purpose; thus, the
compiler is an interpretive one with two work phases -- code
generation and code execution.

Structures  and  Variables 

Aside from the variables and structures needed by the
compiler generator [Gaffney 1969, 18] and the DBMS [Leong 1974,
24], there is a number of files and structures utilized by the
compiler to generate and execute intermediate code.

A  sequential  file  containing  the  syntax  tables  and  an 
indexed  direct  access  file  containing  system  messages  are 
maintained  on  disk  by  the  compiler  designer.  The  computer 
operating  system  is  expected  to  furnish  the  algorithms  for 
accesses  to  these  files. 

To enable the incorporation of multi-syntax selection
and the provision of the inquiry facility, four system files are
used. If the compiler is to maintain these files, it would be
necessary for the designer to provide schemes for storage mapping
and access. To simplify the designer's task, the maintenance of
these files has been relegated to the DBMS and retrieval of the
information is through generated DBMS commands. Hence, the four
system files are represented as relations in the data base.
Information may be obtained from these files through the data
structure language.

The first file, USER_DIRECTORY, is the directory of
query system users. It contains information about a given user's
language syntax capability and the name of his macro file. There
are four fields: a user identification, a code for the user's
"world" and "level" of syntax and the macro file name. The
syntax levels have been structured in such a way that the
language capability increases with the increase in level value;
within a user view, the levels are actually progressive supersets
of preceding ones. Also, it is possible for the user to have an
interest in more than one world of view.

The USER_FILE is a collection of all QLS users' macro
files. When a user defines a macro, it is stored in this
relation under his "file name". The system then uses this file
to expand any macro calls encountered in a user request. Each
entry contains the user's "file name" plus the name and text of
one macro. To protect users and to avoid confusion, a user can
only use macros defined under his code. Perhaps a public file
may eventually be created so that a collection of public macros
may be available to users.

The SYNTAX file is maintained to aid users in the use of
the language. Any inquiries concerning syntax of the language
would be satisfied by the system through this file. The four
fields in this relation are: codes for the "world" and "level" of
the user's view plus left and right hand sides of one production.

Finally, the RELATION_DESCRIPTION file is needed to
provide schema query capability to the QLS users. This file
should really be maintained by the underlying DBMS; that is, if
the DBMS provides schema query facilities, then this relation can
be discarded. The three fields of this file are: a relation
name, a domain name and a textual description of the domain or
relation.

Similar to macro calls, every occurrence of a data base
identifier requires information retrieval from the DBMS in order
to make checks for existence and type compatibility. As data
base identifiers appear frequently in data base requests, the
high overhead cost associated with DBMS access makes frequent
calls to the DBMS for schema information undesirable. The DBMS
is expected to be a multi-user system; hence, the schema table
may be volatile. It would be dangerous to retrieve and keep the
schema table at the higher level for any great length of time. A
compromise is made to resolve this predicament. Schema access
through the DBMS is performed only for file names. For each
occurrence of a file name, the DBMS routine RELATION_INFO is
called to obtain the name and the type of all the fields in that
file. This information is stored in a structure called
FILE_INFO. Since the syntax is structured in such a way as to
allow scope distinction of file names, the arrangement and access
of this structure is facilitated. FILE_INFO structures are
linked in a LIFO list. When the scope containing a file name
terminates, its associated FILE_INFO structure, which is located
at the head of the list, is freed. Thus, the information on a
file is kept at the higher level for no longer than the duration
of the request in which that file is named. Due to the scope
arrangement, no more than two FILE_INFO structures need to be
searched, and within the structure no more than twelve accesses
are needed to validate a field name. Hence, a simple sequential
search algorithm is used.
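The scoped LIFO arrangement described above can be sketched as follows. This is only an illustration under stated assumptions: the class names `FileInfo` and `ScopeList`, and the callable standing in for RELATION_INFO, are hypothetical and are not the report's actual PL/I structures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FileInfo:
    """Field names and types of one file, kept only while its scope is open."""
    file_name: str
    fields: Dict[str, str]  # field name -> field type


class ScopeList:
    """LIFO list of FileInfo structures; the newest scope is searched first."""

    def __init__(self, relation_info: Callable[[str], Dict[str, str]]):
        # relation_info is a stand-in for the DBMS routine RELATION_INFO.
        self._relation_info = relation_info
        self._stack: List[FileInfo] = []

    def enter(self, file_name: str) -> FileInfo:
        """Called once per occurrence of a file name: one DBMS schema access."""
        info = FileInfo(file_name, dict(self._relation_info(file_name)))
        self._stack.append(info)
        return info

    def leave(self) -> None:
        """Scope ends: free the structure at the head of the list."""
        self._stack.pop()

    def lookup(self, field_name: str) -> Optional[str]:
        """Simple sequential search; returns the field type or None."""
        for info in reversed(self._stack):
            for name, ftype in info.fields.items():
                if name == field_name:
                    return ftype
        return None
```

Because only the files named in open scopes are held, the search stays short and no schema information outlives the request that needed it.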


The FILE_INFO structure is also used to temporarily store
other information such as the location of the where clause
structure for the file, the number of fields from that file to be
retrieved, etc. This information is used to generate the COMMAND
and SUBCOMMAND structures required by the lower level.

Finally, the structures of the data structure language
are generated for communication with the EXECUTOR.
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Understanding  of  the  routines  which  compose  the  QLS 
compiler and interpreter may be enhanced if the reader is
acquainted with the basic processing algorithms used by QLS. For
each type of QLS request, the basic algorithms followed are
described in terms of commands passed to the DBMS. The DBMS
commands are presented in the form required by the DML compiler
with the keywords underscored and values which are to be supplied
in small letters. Recall that the actual commands submitted to
the DBMS are in a data structure form.

•  Validate user-ID:

USERID IS userid password.

This command will cause the DBMS to search its user
directory to validate the user's authorization. If the identity
is okay, then the user is effectively on active status.

•  To  obtain  user  syntax  capabilities: 

CREATE SYSFILE FROM USER_DIRECTORY
       USERVIEW USERLEVL FILENAME
       WHERE USERCODE = userid.

GET_NEXT SYSFILE USERVIEW, USERLEVL, FILENAME.
DESTROY RELATION SYSFILE.

The first command will cause a search of the QLS' user
directory for the entry which has the userid as the key. If
retrieval is successful, then the GET_NEXT command will cause the
desired values to be passed to QLS. To complete the operation, a
DESTROY command is issued.

Most  data  base  requests  are  reformulated  directly  into 
commands  to  the  DBMS.  The  exception  to  this  is  the  data 
retrieval  command.  For  this  request,  the  query  language  system 
needs  to  issue  at  least  five  commands  to  the  DBMS.  Files 
relevant  to  the  retrieval  process  must  be  opened;  hence,  an  OPEN 
command  is  issued.  A  temporary  file  containing  all  records  which 
satisfy  the  user’s  selection  criteria  is  created  using  the  CREATE 
command.  For  each  record  in  the  temporary  file,  a  GET_NEXT 
command  must  be  given  to  obtain  the  values.  When  all  records 
have been processed, the temporary file is wiped from the data
base with the DESTROY command. Finally, all the files which have
been opened are closed.

•  On  macro  definition: 

CREATE RELATION SYSFILE FROM USER_FILES MAC_BODY
       WHERE MAC_NAME=macro_name AND FILEID=filename.

DESTROY RELATION SYSFILE.

INSERT INTO USER_FILES
       MAC_NAME WITH VALUE macro_name,
       MAC_BODY WITH VALUE macro_text,
       FILEID WITH VALUE filename.

When  a  user  wishes  to  define  a  macro,  the  user's  macro 
file  is  first  searched  by  means  of  the  above  CREATE  command  for 
possible  name  duplication.  If  the  macro  name  furnished  by  the 
user  is  already  in  use,  then  a  duplicate  name  is  noted  and  a 
DESTROY  command  is  used  on  the  new  name.  Otherwise,  the  new 
record  containing  the  user's  new  macro  is  added  onto  the  macro 
file  via  an  INSERT  command. 
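As a rough sketch of this check-then-insert sequence, the following uses a small in-memory stand-in for the DBMS. The class `FakeDBMS` and its method names are assumptions made for illustration; the real commands are passed to the DBMS as data structures, not method calls. The sketch also assumes the DESTROY removes the temporary relation SYSFILE built by the CREATE.

```python
class FakeDBMS:
    """In-memory stand-in for the DBMS (hypothetical, not the ZETA interface)."""

    def __init__(self):
        self.user_files = []  # rows of (MAC_NAME, MAC_BODY, FILEID)
        self.log = []         # commands issued, in order

    def create(self, mac_name, fileid):
        """CREATE: select the records matching the macro name and file id."""
        self.log.append("CREATE")
        return [r for r in self.user_files
                if r[0] == mac_name and r[2] == fileid]

    def destroy(self):
        """DESTROY: discard the temporary relation built by CREATE."""
        self.log.append("DESTROY")

    def insert(self, mac_name, mac_body, fileid):
        """INSERT: add the new macro record to the macro file."""
        self.log.append("INSERT")
        self.user_files.append((mac_name, mac_body, fileid))


def define_macro(dbms, mac_name, mac_body, fileid):
    """Return True if the macro was stored, False on a duplicate name."""
    matches = dbms.create(mac_name, fileid)
    dbms.destroy()
    if matches:
        return False  # duplicate name: nothing is inserted
    dbms.insert(mac_name, mac_body, fileid)
    return True
```

A second definition under the same name and file is rejected after only the CREATE and DESTROY commands have been issued.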

•  On  macro  usage: 

To expand a macro call, a CREATE command identical to
that used for defining a macro is issued. If such a macro exists
(i.e., the DBMS returns a successful indicator with one record
satisfying the condition), then a GET_NEXT command which
retrieves the values and a DESTROY request are consecutively
emitted.

•  On macro cancellation:

DELETE FROM USER_FILES WHERE MAC_NAME=macro_name
       AND FILEID=filename.

The DELETE command is issued to remove the record
associated with that macro from the macro file.


Inquiry Requests

Inquiry requests are actually degenerate data base
retrieval commands because the files which are used to supply
information are implemented as relations in the data base. They
are, in reality, more simplified forms of retrieval as the users
need supply only field values. The system automatically supplies
the file and field names and formulates the retrieval command to
the DBMS in such a way that the correct information will be
obtained. The formulation of the CREATE command varies with the
subject of inquiry as follows:

•  for  macro  information: 

CREATE RELATION SYSFILE FROM USER_FILES MAC_NAME,
       MAC_BODY WHERE FILEID=filename [AND
       MAC_NAME IS ONE OF (macro_name, ...)].

If  the  user  does  not  furnish  any  macro  names,  then  the 
condition  enclosed  in  square  brackets  would  be  omitted. 

•  for  syntax  information  : 

CREATE RELATION SYSFILE FROM SYNTAX LHS, RHS
       WHERE WORLD=userview AND LEVEL=userlevel
       [AND LHS IS ONE OF (nonterminal name, ...)].

If  the  user  does  not  furnish  specific  nonterminal  names 
then  the  condition  enclosed  in  the  square  brackets  is  omitted. 

•  for  schema  information  : 

One  of  the  following  CREATF  commands  is  applicable: 

(1)  CREATE RELATION SYSFILE FROM RELATION_DESCRIPTION
            REL_NAME, DOM_NAME, TEXT_DES
            WHERE REL_NAME IS ONE OF (file name, ...)
            AND DOM_NAME=blank value.

(2a) CREATE RELATION SYSFILE FROM RELATION_DESCRIPTION
            REL_NAME, DOM_NAME, TEXT_DES
            WHERE DOM_NAME IS ONE OF (field name, ...)
            [AND REL_NAME=file name].

(2b) CREATE RELATION SYSFILE FROM RELATION_DESCRIPTION
            REL_NAME, DOM_NAME, TEXT_DES
            WHERE DOM_NAME ¬= blank value
            [AND REL_NAME=file name].

(3)  CREATE RELATION SYSFILE FROM RELATION_DESCRIPTION
            REL_NAME, DOM_NAME, TEXT_DES
            WHERE (REL_NAME=name AND DOM_NAME=blank value)
            OR DOM_NAME=name.

The first CREATE command would be used to retrieve
descriptions of files. For field information, 2a would be used
if the user specifies the field names; otherwise, 2b would be
used. If the user does not distinguish a data base name as a
field name or file name, then the third command would be issued.
Those phrases which are enclosed within square brackets would be
omitted if the user has not furnished the relevant information.
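The choice among the four schema commands, including the optional bracketed phrases, might be sketched as below. The function names, the `kind` argument, the textual rendering of the blank value and the ASCII spelling `^=` for "not equal" are all assumptions for illustration; QLS builds these commands as data structures rather than text.

```python
BLANK = "' '"  # assumed textual spelling of the blank value

HEAD = ("CREATE RELATION SYSFILE FROM RELATION_DESCRIPTION "
        "REL_NAME, DOM_NAME, TEXT_DES WHERE ")


def one_of(column, names):
    """Render 'column IS ONE OF (a, b, ...)'."""
    return f"{column} IS ONE OF ({', '.join(names)})"


def schema_inquiry(kind, names=(), file_name=None):
    """Formulate the CREATE command text for a schema inquiry."""
    if kind == "file":                     # command (1): file descriptions
        cond = f"{one_of('REL_NAME', names)} AND DOM_NAME={BLANK}"
    elif kind == "field":                  # command (2a) or (2b)
        if names:
            cond = one_of("DOM_NAME", names)
        else:
            cond = f"DOM_NAME^={BLANK}"    # every field: domain not blank
        if file_name:                      # bracketed phrase is optional
            cond += f" AND REL_NAME={file_name}"
    elif kind == "either":                 # command (3): file or field name
        (name,) = names
        cond = f"(REL_NAME={name} AND DOM_NAME={BLANK}) OR DOM_NAME={name}"
    else:
        raise ValueError(f"unknown inquiry kind: {kind}")
    return HEAD + cond + "."
```

For example, `schema_inquiry("field", file_name="GOALIES")` yields the 2b form restricted to one file, which is what a request such as LIST FIELDS IN FILE GOALIES would need.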

No security check has been incorporated into the schema
query facility. The information in RELATION_DESCRIPTION is
available to all authorized QLS users. After all, having the
knowledge that information exists is useless to anyone unless
that person can get at the actual content.

Procedures Used

The  description  of  the  procedures  which  are  used  for  the 
compilation  and  execution  of  user  requests  is  presented  according 
to  the  functions  served  by  the  routines. 

■  The  "Driver" 

QSYSTEM is the "driver" module of the query language
system. The functions of the QLS driver are as follows:

1.  obtains user identity and password;

2.  validates user's identity with the DBMS via the command
    described in the last subsection;

3.  obtains from the user directory, the user's syntax
    capability and the name of his macro file;

4.  reads in the syntax tables;

5.  for each user request:

    opens the system files in the DBMS if applicable,
    reads request into the input buffer,
    initializes the scanning variables,
    calls the parser to compile the user request,
    prints out compilation time usage,

    if compilation is successful but syntax or
    semantic errors occurred, obtains a go-ahead
    signal from the user; before calling on the
    interpreter, closes the system files and clears the
    compilation working area; on return from the
    interpreter, prints execution time usage and clears
    all working areas,

    if compilation is unsuccessful, prints a diagnostic
    message;

6.  signs user off the system.

■  Semantic Routines

The principal purpose of the semantic routines is to
emit intermediate code for the interpreter. Part of the
intermediate code consists of commands to the DBMS which are
stored in a data structure comprehensible to the EXECUTOR. Thus,
much of the work performed by the semantic routines is towards
the assembly of relevant information into these data structures.
The most complicated to be built is the representation for the
where clause. To aid in this task, a pointer stack and four
subroutines are used. The PUSH_PTR_STACK subroutine is needed to
enter a pointer onto the pointer stack after making a check for
stack overflow. A second subroutine is used to store the relevant
information into a structure whose location is then pushed
onto the pointer stack. BUILD_BOOLEAN_OPERATOR is used to build
up an OPERATOR structure using the top two elements of the
pointer stack. Lastly, BUILD_CHOICE_CONDITION is needed to build
the structure which is a combination of LEAF and OPERATOR
structures for the IS ONE OF and IS NOT ONE OF comparison.
Information destined for the other structures is collected and
entered at the appropriate time.
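A minimal sketch of how a pointer stack can assemble a where clause tree is given below. The class layout, the stack limit and the method names are hypothetical. In particular, expanding IS ONE OF into OR-ed equality leaves (and IS NOT ONE OF into AND-ed inequality leaves) is an assumption; the report only states that the choice structure combines LEAF and OPERATOR structures.

```python
from dataclasses import dataclass
from typing import List, Union

STACK_LIMIT = 32  # assumed capacity; the report does not give one


@dataclass
class Leaf:
    field_name: str
    op: str
    value: object


@dataclass
class Operator:
    op: str          # 'AND' or 'OR'
    left: "Node"
    right: "Node"


Node = Union[Leaf, Operator]


class PointerStack:
    def __init__(self, limit: int = STACK_LIMIT):
        self.items: List[Node] = []
        self.limit = limit

    def push(self, node: Node) -> None:
        """Enter a pointer after checking for stack overflow."""
        if len(self.items) >= self.limit:
            raise OverflowError("pointer stack overflow")
        self.items.append(node)

    def build_leaf(self, field_name: str, op: str, value: object) -> None:
        """Store one comparison in a LEAF and push its location."""
        self.push(Leaf(field_name, op, value))

    def build_boolean_operator(self, op: str) -> None:
        """Combine the top two stack elements into one OPERATOR."""
        right = self.items.pop()
        left = self.items.pop()
        self.push(Operator(op, left, right))

    def build_choice_condition(self, field_name, values, negate=False):
        """IS (NOT) ONE OF: a chain of leaves joined by operators."""
        cmp_op, join = ("^=", "AND") if negate else ("=", "OR")
        self.build_leaf(field_name, cmp_op, values[0])
        for value in values[1:]:
            self.build_leaf(field_name, cmp_op, value)
            self.build_boolean_operator(join)
```

When the parser reduces operands before operators, a condition such as GAGAINST >= 10 OR EN_GOALS > 0 AND SHUTOUTS => 1 leaves a single pointer on the stack whose root is the OR operator.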

For macro definitions and cancellations, only a DBMS
command (INSERT and DELETE respectively), which has been discussed
in the last subsection, needs to be formulated. The commands
needed to check for the duplicate use of a macro name would have
been issued in one of the semantic routines during compilation of
the request; hence, they would not be stored as intermediate
code. In the case of inquiry requests which require information
to be printed, the identity of the output routine to be used is
passed along with the appropriate DBMS command to the
interpreter.

For data base requests, a larger work load is required.
All data base names must be validated. Field values must be
checked for type compatibility and, if necessary, converted. With
the exception of information requests, all data base requests
need only the construction of the appropriate DBMS command. In
the case of information retrieval, intermediate code for the
report generation facility must be issued in addition to the DBMS
command (CREATE and CREATE-JOIN) and the identity of the output
routine and the output device to be used. The OUTPUT_ITEM
structure contains information concerning the location of the
item (either from the constant pool or from the DBMS), the
conversion algorithm to be used, the print format including the
field's start print column, the format type and amount of spacing
to be inserted after the field, and the column heading. For
efficiency's sake, any type conversion needed for constant values
is performed only once in the compilation stage.

■  The Interpreter

XEQ is the interpreter routine for the compiler. It
contains four output subroutines, one each for the output of
syntax inquiry, macro inquiry, schema inquiry and data base
information retrieval. Commands are grouped into two types:
those requiring information output and those that do not. If no
output of information is needed, then the task of the interpreter
consists of: the assembly of the DBMS command structures; the
submission of the command to the DBMS and a check that the
command is successfully executed; if not successfully executed,
an error procedure is called to print the DBMS's diagnostic
message and abort execution.

For  requests  which  require  output,  the  following  functions 
are  performed: 

1.  open the relevant files with an OPEN command to the DBMS;

2.  assemble the CREATE or CREATE-JOIN command structures;

3.  submit the command to the DBMS;

4.  if execution of the command is successful, then the
    appropriate output routine is called; otherwise, the
    error procedure is called;

5.  emit a DESTROY command to the DBMS for the file created;

6.  emit a CLOSE to the DBMS to close the opened files.
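The six steps above can be sketched as a short control routine. The stub DBMS, the `ExecutionAborted` exception and the function names are hypothetical stand-ins. The report does not say whether any cleanup is attempted after the error procedure aborts execution, so this sketch simply stops at that point.

```python
class ExecutionAborted(Exception):
    """Raised by the error procedure to abort execution."""


class StubDBMS:
    """Logging stand-in for the DBMS (not the real ZETA interface)."""

    def __init__(self, records, fail_create=False):
        self.records = list(records)
        self.fail_create = fail_create
        self.log = []

    def command(self, name):
        """Submit a command; returns True when it executes successfully."""
        self.log.append(name)
        return not (self.fail_create and name == "CREATE")

    def get_next(self):
        """Return the next record of the temporary file, or None at the end."""
        self.log.append("GET_NEXT")
        return self.records.pop(0) if self.records else None


def run_output_request(dbms, output_routine):
    dbms.command("OPEN")                      # step 1
    if dbms.command("CREATE"):                # steps 2 and 3
        output_routine(dbms)                  # step 4, success branch
    else:
        raise ExecutionAborted("DBMS diagnostic message")  # step 4, failure
    dbms.command("DESTROY")                   # step 5
    dbms.command("CLOSE")                     # step 6


def collect_records(dbms, sink):
    """A trivial output routine: fetch every record with GET_NEXT."""
    record = dbms.get_next()
    while record is not None:
        sink.append(record)
        record = dbms.get_next()
```

Passing the output routine as an argument mirrors the way the identity of the output routine is carried along with the DBMS command in the intermediate code.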

In the output routines, a check is made for the number
of records retrieved by the CREATE command. For each record in
the temporary file, a GET_NEXT command is submitted to the DBMS
to obtain the values. The only difference between the output
routines is the output format used. For the inquiry requests,
the type of information to be printed is fixed; hence, those
routines involved in the output of inquiry information need only
print the information according to predefined formats. The
routine which outputs data base information according to user
specification has a more complicated task consisting of the
following functions:

1.  open the appropriate output file if applicable;

2.  print the report heading if requested;

3.  print the column headings if applicable;

4.  for each record retrieved:

    check for end of page and print the column heading
    if needed,

    follow the list of OUTPUT_ITEM structures and print
    the associated value with the format specified;

5.  close the output file if necessary.
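The report output steps can be illustrated with a simplified sketch. The `OutputItem` fields below are a reduced, hypothetical version of the OUTPUT_ITEM structure (a print width and trailing spacing stand in for the start column and format type), and the page length is an assumed parameter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class OutputItem:
    source: str                 # 'constant' or 'dbms'
    key: object                 # constant value, or field name in the record
    width: int                  # print width of the field
    spacing: int = 0            # blanks inserted after the field
    heading: str = ""           # column heading
    convert: Callable[[object], str] = str  # conversion algorithm


def _cell(text: str, item: OutputItem) -> str:
    """Pad or cut text to the item's width, then add the trailing spacing."""
    return text.ljust(item.width)[:item.width] + " " * item.spacing


def heading_line(items: List[OutputItem]) -> str:
    return "".join(_cell(it.heading, it) for it in items).rstrip()


def format_record(items: List[OutputItem], record: Dict[str, object]) -> str:
    """Follow the list of items and format each associated value."""
    parts = []
    for it in items:
        raw = it.key if it.source == "constant" else record[it.key]
        parts.append(_cell(it.convert(raw), it))
    return "".join(parts).rstrip()


def build_report(items, records, page_length=50,
                 report_heading: Optional[str] = None) -> List[str]:
    """Return the report lines; column headings repeat on each new page."""
    lines = []
    if report_heading:
        lines.append(report_heading)
    lines.append(heading_line(items))
    on_page = 0
    for record in records:
        if on_page == page_length:      # end of page reached
            lines.append(heading_line(items))
            on_page = 0
        lines.append(format_record(items, record))
        on_page += 1
    return lines
```

Returning the lines instead of printing them keeps the sketch independent of the output device (terminal, printer or file) chosen by the user.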


■  Utility Routines

There are seven utility routines which may be called by
the procedures discussed. They serve the following purposes:

1.  abort compilation or execution attempt;

2.  print diagnostic messages;

3.  perform type conversion;

4.  release some DBMS command structures;

5.  retrieve macro text for macro call expansion.

Language Facilities Summary

This chapter has described the language facilities, the
highest logical level of ZETA. Self-contained language systems
(SLS) have been distinguished from host programming language
systems (HLS). The design and implementation of an HLS using a
preprocessor and a DML compiler has been outlined. A query
language generating system (QLS) which produces compilers for
self-contained, problem oriented languages has been described.
Finally, a sample query language was designed; it is currently
undergoing implementation using the QLS.

CHAPTER 5

Current  Status,  Future  Directions  and  Summary 

ZETA is an on-going project for further research and
development of relational DBMS implementations. Currently, the
majority of procedures for each level described in this report
have been coded for an IBM 370 computer; however, they are not
completely debugged. Implementation is continuing, including the
incorporation of inverted lists [Farley 17]. A subset of
the language features and an inverted list search mechanism are
expected to be operative by July 1975.

ZETA is also operating as a DBMS for an artificial
intelligence project called TORUS. TORUS is investigating
natural language understanding employing semantic networks.
Currently, it handles simple sentences to retrieve data from a
student information data base. MINIZ is being used to support
TORUS; however, a higher level interface may be used in the near
future.

As a result of the experience with ZETA, another
relational DBMS design project, called OMEGA, is beginning. Any
implementation will be on a PDP-11/45. The goals of OMEGA are
somewhat different from those of ZETA. OMEGA is investigating a
network-related approach; hence, it uses some additional basic
mechanisms not present in ZETA.

Once ZETA is operating as a unit, research regarding
specific expansions will dominate the project. These include:

1.  add more single and multiple relation operations,

2.  extend the snapshot facility for better update
    oriented access,

3.  extend the mark facility with extra mark operations,

4.  continue experiments with search optimization features,

5.  add features to the user interfaces
    (e.g., arithmetic functions), and

6.  experiment with various applications.

Summary 

The ZETA project attempted to design and implement a
relational DBMS in three distinct parts. MINIZ, the lowest
level, provides a basic relational system. It enables the
representation and manipulation of primary and derived relations
on a single relation, single pass, tuple at a time basis. The
main feature of this level is the mark mechanism, a unary pointer
relation, which is used to provide access to subsets of data.
The EXECUTOR, the intermediate level, provides a multiple
relation, multiple pass, relation oriented view of the data base
for the language facilities level. It accepts high level
commands which it executes using MINIZ primitives. The main
features of this level are the multiple relation view and the
ability to derive new relations. Finally, the language
facilities level provides the highest logical view of the data
base through a host programming language system (HLS) and a query
language system (QLS). The QLS is a system which enables the
production of self-contained, problem oriented languages
with reduced effort.

As a relational DBMS, ZETA will offer
relational data base at three successive levels of
abstraction. ZETA is continuing as a project to
develop a relational DBMS. Further, it will provide
for pedagogical and experimental purposes.
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APPENDIX  A 


The purpose of this appendix is to aid the readers
in the understanding of the concepts used by the
Compiler Generator as discussed in Chapter 4. The
items are reproduced (with some changes) from the
technical report by Gaffney [18].
( 1) <ibnf_syntax>       ::= <control_card>* <production>+
                             ('END' | 'REGEN')

( 2) <control_card>      ::= 'CMARG' '='
                             '(' <*N> ',' <*N> ')'
                           | '$' 'CTITLE' '=' <*S>
                           | '$' 'MACRO'
                           | '$' 'LEADMAC' '=' <*S>
                           | '$' 'LEADARG' '=' <*S>

( 3) <production>        ::= <phrase_class> '::='
                             <definition> ';'

( 4) <phrase_class>      ::= '<' <phrase_class_name> '>'

( 5) <phrase_class_name> ::= <*I>

( 6) <definition>        ::= <alternative> ('|' <alternative>)*

( 7) <alternative>       ::= <item>+ <action_routine>?

( 8) <item>              ::= ( <phrase_class>
                             | '(' <definition> ')' )
                             <repeat_character>?
                           | <semantic_test>
                           | <terminal_symbol>

( 9) <action_routine>    ::= <*N>

(10) <repeat_character>  ::= '*' | '?' | '+'

(11) <semantic_test>     ::= '<' <action_routine> '>'

(12) <terminal_symbol>   ::= '<*I>' | '<*N>' | '<*B>'
                           | '<*S>' | <*S>


Note: The syntax is described in IBNF syntax format.

Notations used are:

*    - may occur zero or more times,

+    - may occur one or more times,

?    - may occur zero or one time,

<*I> - denotes occurrence of an identifier,

<*N> - denotes occurrence of a number,

<*S> - denotes a quoted character string,

<*B> - denotes occurrence of a bit string;
terminal symbols are enclosed within quotes.
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APPENDIX  B 


A Sample Query Language: Syntax and Examples


In order to test and illustrate the proposed
approach, an example is used. This appendix
contains the syntax for the sample query language,
sample instances of the system relations and some
examples of query language requests.


B.1  Syntax of the Sample Query Language


( 1) <user_command>        ::= <data_base_request>
                             | <macro_definition>
                             | <inquiry>

( 2) <macro_definition>    ::= 'DEFINE ' <macro_name>
                               'AS ' <macro_text>
                             | 'CANCEL ' ('MACRO ')?
                               <macro_name>

( 3) <macro_name>          ::= <*I>

( 4) <inquiry>             ::= ('LIST ' | 'WHAT ' ('IS ' | 'ARE '))
                               <subject_of_inquiry>

( 5) <subject_of_inquiry>  ::= 'SYNTAX ' <syntax_list>?
                             | ('MACRO ' | 'MACROS ')
                               <macro_name_list>?
                             | <data_base_question>

( 6) <syntax_list>         ::= <syntax_name> (',' <syntax_name>)*

( 7) <syntax_name>         ::= ('<')? <*I> ('>')?

( 8) <macro_name_list>     ::= <macro_name> (',' <macro_name>)*

( 9) <data_base_question>  ::= ('FILES ' | 'FILE ')
                               <data_base_names>?
                             | ('FIELDS ' | 'FIELD ')
                               <data_base_names>?
                               ('IN ' <file>?
                               <data_base_identifier>)?
                             | <data_base_identifier>

(10) <data_base_names>     ::= <data_base_identifier>
                               (',' <data_base_identifier>)*

(11) <data_base_identifier> ::= <*I>

(12) <data_base_request>   ::= 'FOR ' ((('EACH ')? 'RECORD '
                               | 'RECORDS ') 'IN '
                               <file>? )?
                               <file_specification>
                               <action_clause>

(13) <file_specification>  ::= <data_base_identifier>
                               <where_clause>?

(14) <action_clause>       ::= 'DELETE '
                             | 'DESTROY '
                             | ('UPDATE ' | 'INSERT ') <with>?
                               <field_name_value>
                               (',' <field_name_value>)*
                             | <join_clause>
                               (<create_clause> | <output_clause>)

(15) <field_name_value>    ::= <data_base_identifier>
                               ('=' | 'EQUAL ' ('TO ')?)
                               <expression>

(16) <join_clause>         ::= 'COUPLED '
                               ('BY ' <linking_pair>
                               (<separator> <linking_pair>)*)?
                               'TO ' <file>? <file_specification>

(17) <create_clause>       ::= 'CREATE ' ('NEW ')?
                               <data_base_identifier>
                               'WITH ' <new_name_value>
                               (<separator> <new_name_value>)*

(18) <new_name_value>      ::= <field_name> ('RENAMED '
                               <data_base_identifier>)?

(19) <linking_pair>        ::= '(' <data_base_identifier> ','
                               <data_base_identifier> ')'

(20) <field_name>          ::= <data_base_identifier>
                               ('FROM ' <data_base_identifier>)?

(21) <file>                ::= 'FILE '

(22) <with>                ::= 'WITH '

(23) <separator>           ::= 'AND ' | ','

(24) <where_clause>        ::= 'WHERE ' <boolean_condition>

(25) <boolean_condition>   ::= <boolean_term>
                               ('OR ' <boolean_term>)*

(26) <boolean_term>        ::= <boolean_factor>
                               ('AND ' <boolean_factor>)*

(27) <boolean_factor>      ::= '(' <boolean_condition> ')'
                             | <boolean_primary>

(28) <boolean_primary>     ::= <data_base_identifier>
                               ( <compare_operator>
                                 <expression>
                               | 'IS ' ('NOT ')? 'ONE ' 'OF '
                                 '(' <value_list> ')' )

(29) <compare_operator>    ::= '¬=' | '¬<' | '>='
                             | '=>' | '<=' | '=<'
                             | '¬>' | '=' | '<' | '>'

(30) <value_list>          ::= <expression> (',' <expression>)*
                             | ('UNIQUE ')? <data_base_identifier>
                               'IN ' <file_specification>

(31) <expression>          ::= <*B> | <*S> | <*N>
                             | <data_base_identifier>

(32) <output_clause>       ::= 'OUTPUT ' <output_device>?
                               <report_heading>? <display_list>?
                               <sort_clause>?

(33) <output_device>       ::= 'TO ' <device>

(34) <device>              ::= 'TERMINAL ' | 'PRINTER '
                             | 'FILE ' <*I>

(35) <report_heading>      ::= <with>? ('REPORT ')? 'HEADING '
                               <heading> ('SEPARATED ')?
                               ((('LEFT ' | 'RIGHT ') 'JUSTIFIED ')
                               | 'CENTERED ')?

(36) <display_list>        ::= 'LISTING ' <display_item>
                               (',' <display_item>)*

(37) <display_item>        ::= <spacing_control>? <format>?
                               <expression> <heading>?

(38) <spacing_control>     ::= 'X(' <*N> ')'

(39) <format>              ::= ('F(' | 'E(' | 'A(' | 'B(')
                               <*N> (',' <*N>)? ')'

(40) <heading>             ::= '/' (<*S>)+ '/'

(41) <sort_clause>         ::= 'ORDERED '
                               ('BY ' ('LOW ' | 'HIGH ')
                               <field_name>
                               <summary_specification>?)+


B.2  A Hockey Data Base For The Request Examples


For the query language examples in the next section,
consider a data base of hockey statistics with the following
relations:

TEAMS(    TEAMID, NAME, LOCATION, GAMEWON, GAMETIED, GAMELOSS,
          GOAL_FOR, GAGAINST)

GAME_REC( GAMEID, HTEAM_ID, VTEAM_ID, HSCORE, VSCORE)

GOALIES(  PLAYERID, GAGAINST, EN_GOALS, SHUTOUTS)

PLAYERS(  PLAYERID, NAME, PAMEMBER, G_PLAYED, ASSISTS, PIM)

TEAMPLAY( TEAMID, PLAYERID)



B.3  Examples of Query Language Requests

■  Requests which use the macro facility:

1.  DEFINE %CONDITION1 AS PAMEMBER = '0'B AND BDATE > '19511231'

2.  FOR PLAYERS WHERE %CONDITION1 OUTPUT PLAYERID, NAME, BDATE.

3.  FOR PLAYERS WHERE (PAMEMBER = '0'B AND BDATE > '19511231')
    OUTPUT PLAYERID, NAME, BDATE.

4.  CANCEL MACRO %CONDITION1.

5.  DEFINE %PLAYER_TEAM_ASSOCIATION AS FOR TEAMPLAY WHERE
    PLAYERID = &1 OUTPUT TEAMID.

6.  %PLAYER_TEAM_ASSOCIATION ('0049').

■  Inquiry requests:

7.  LIST SYNTAX <USER_COMMAND>

8.  WHAT ARE FILES?

9.  WHAT ARE FIELDS TEAMID, PLAYERID IN TEAMPLAY?

10. LIST FIELDS IN FILE GOALIES.


■  Data base requests:


11. FOR EACH RECORD IN FILE TEAMS WHERE NAME='CANADIENS' OUTPUT
    GAMEWON, GAMELOSS, GAMETIED, GOAL_FOR, GAGAINST.

12. FOR RECORDS IN GOALIES WHERE GAGAINST >= 10 OR EN_GOALS > 0
    AND SHUTOUTS => 1 OUTPUT 'GOALIE ', PLAYERID, GAGAINST,
    EN_GOALS, SHUTOUTS.

13. FOR GAME_REC INSERT GAMEID='Z7', HTEAM_ID='B3', HSCORE=9,
    VTEAM_ID='A1', VSCORE='7'.

14. FOR PLAYERS COUPLED BY (PLAYERID, PLAYERID) TO TEAMPLAY WHERE
    TEAMID IS ONE OF (TEAMID IN TEAMS WHERE LOCATION='MONTREAL')
    OUTPUT TO PRINTER WITH REPORT HEADING /'MONTREAL'
    'CANADIENS'/ SEPARATED CENTERED X(15) TEAMID /'TEAM'/, A(15)
    NAME /'NAME'/, X(10) PAMEMBER, X(10) G_PLAYED /'GAMES'
    'PLAYED'/, X(5) A(5) GOALS /'GOALS' 'SCORED'/, X(10) F(7,1)
    ASSISTS /'ASSISTS'/, E(14,6) PIM /'PENALTY' 'IN MINS'/
    ORDERED BY LOW BDATE BY HIGH NAME.


riM^VF^SITY  0?  TOPONTO 


rO?^PUTFF.  SYSTFf^S  PESFAPCH  GROUP 


BI3LI OGRZIPHY  OF  CSFG  TFCHNICAL  PEPOPTS  + 


CSPG-1  F'^PIPICAL  COr^PAPISON  OF  LP(k)  D  PRECEDENCE  P^vRSERS 
J.J,  Horning  and  W.R,  Lalonde,  September  1970 
[ACM  SIGPLAN  Notices,  November  1970] 

CSRG-2  AN  EFFICIENT  LALP  PAPSEP  GENERATOR 
W.P.  Lalond^,  February  1971 
ry.,A.Sc,  Thesis,  FF  1971] 


*  CSPG-3  A.  PROCESSOR  GFNEPATOP  SYSTEM 

J.D.  Gorrie,  February  1971 
.A, 5c.  Thesis,  FF  1971] 

*  CSPG-4  DYLAN  USER'S  MANUAL 

P.F.  Ponzon,  Ma^^ch  19R1 


C3PG-5  DIAL  -  A  PROGRAMMING  SYSTEM  FOP  INTERACTIVE  ALGEBRAIC 
MANIPULATION 

Alan  C.M.  Prown  and  J.J.  Horning,  March  1971 


*  CSRG-6  ON  DEADLOCK  IN  COMPUTER  SYSTEMS 
Richard  C.  Holt,  April  1971 
[Ph.D.  Thesis,  Dept,  of  Computer  Science, 

Cornell  University,  1971] 

ESRG-I  THE  STAR-PING  SYSTEM  OF  LOOSELY  COUPLED  DIGITAL  DEVICES 
John  Neill  Thomas  Potvin,  August  1971 
[M.A.Sc.  Thesis,  FF  1971] 


CSRG-R  FILE  ORGANIZATION  AND  STRUCTURE 
G.M.  Stacey,  August  1971 

CSRG-9  DESIGN  STUDY  FOR  A  TWO-DIMENSIONAL  COMPUTER- ASS  IS  IE D 
ANIMATION  SYSTEM 
Kenneth  3,  Fvans,  January  1972 
[M.Sc.  Thesis,  DCS  1972] 


*CSPG-10  HOW  A  PROGRAMMING  LANGUAGE  IS  USED 

William  Gregg  .Alexander,  February  1972 
[M.Sc.  Thesis,  DCS  1971] 


CSRG-11  PPOJEC'^  SUE  STATUS  REPORT 

J.W.  Atvood  (ed.),  April  1972 

CSPG-1 2  THFFE  DIMFNSIONAX  DATA  DISPLAY  WITH  HIDDEN  LINE  REMOVAL 
Rupert  Bramall,  A.pril  1972 
[ M. Sc.  Thesis ,  DCS  1971  ] 


t  Abbreviations: 

DCS  -  Department  of  Computer  Science,  University  of  Toronto 
”F  -  Department  of  Electrical  Engineering,  University  of 
Tor ontD 

*  -  Out  of  print 


CSRG-13  A  SYNTAX  DIRECTED  ERROR  RECOVERY  METHOD
Lewis  R.  James,  May  1972
[M.Sc.  Thesis,  DCS  1972]

CSRG-14  THE  USE  OF  SERVICE  TIME  DISTRIBUTIONS  IN  SCHEDULING
Kenneth  C.  Sevcik,  May  1972
[Ph.D.  Thesis,  Committee  on  Information  Sciences,
University  of  Chicago,  1971;  JACM,  January  1974]

CSRG-15  PROCESS  STRUCTURING
J.J.  Horning  and  B.  Randell,  June  1972
[ACM  Computing  Surveys,  March  1973]

CSRG-16  OPTIMAL  PROCESSOR  SCHEDULING  WHEN  SERVICE  TIMES  ARE
HYPEREXPONENTIALLY  DISTRIBUTED  AND  PREEMPTION  OVERHEAD
IS  NOT  NEGLIGIBLE
Kenneth  C.  Sevcik,  June  1972
[Proceedings  of  the  Symposium  on  Computer-Communication,
Networks  and  Teletraffic,
Polytechnic  Institute  of  Brooklyn,  1972]

CSRG-17  PROGRAMMING  LANGUAGE  TRANSLATION  TECHNIQUES
W.M.  McKeeman,  July  1972

CSRG-18  A  COMPARATIVE  ANALYSIS  OF  SEVERAL  DISK  SCHEDULING
ALGORITHMS
C.J.M.  Turnbull,  September  1972

CSRG-19  PROJECT  SUE  AS  A  LEARNING  EXPERIENCE
K.C.  Sevcik  et  al,  September  1972
[Proceedings  AFIPS  Fall  Joint  Computer  Conference,
V.  41,  December  1972]

CSRG-20  A  STUDY  OF  LANGUAGE  DIRECTED  COMPUTER  DESIGN
David  B.  Wortman,  December  1972
[Ph.D.  Thesis,  Computer  Science  Department,
Stanford  University,  1972]

CSRG-21  AN  APL  TERMINAL  APPROACH  TO  COMPUTER  MAPPING
F.  Kvaternik,  December  1972
[M.Sc.  Thesis,  DCS  1972]

*CSRG-22  AN  IMPLEMENTATION  LANGUAGE  FOR  MINICOMPUTERS
G.G.  Kalmar,  January  1973
[M.Sc.  Thesis,  DCS  1972]

CSRG-23  COMPILER  STRUCTURE
W.M.  McKeeman,  January  1973
[Proceedings  of  the  USA-Japan  Computer  Conference,  1972]
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J.D.  Gannon  (ed.),  March  1973 


CSRG-25  THE  INVESTIGATION  OF  SERVICE  TIME  DISTRIBUTIONS
Eleanor  A.  Lester,  April  1973
[M.Sc.  Thesis,  DCS  1973]

*CSRG-26  PSYCHOLOGICAL  COMPLEXITY  OF  COMPUTER  PROGRAMS:
AN  INITIAL  EXPERIMENT
Larry  Weissman,  August  1973

*CSRG-27  STRUCTURED  SUBSETS  OF  THE  PL/I  LANGUAGE
Richard  C.  Holt  and  David  B.  Wortman,  October  1973

CSRG-28  ON  THE  REDUCED  MATRIX  REPRESENTATION  OF  LR(k)
PARSER  TABLES
Marc  Louis  Joliat,  October  1973
[Ph.D.  Thesis,  EE  1973]

*CSRG-29  A  STUDENT  PROJECT  FOR  AN  OPERATING  SYSTEMS  COURSE
B.  Czarnik  and  D.  Tsichritzis  (eds.),  November  1973

CSRG-30  A  PSEUDO-MACHINE  FOR  CODE  GENERATION
Henry  John  Pasko,  December  1973
[M.Sc.  Thesis,  DCS  1973]

CSRG-31  AN  ANNOTATED  BIBLIOGRAPHY  ON  COMPUTER  PROGRAM
ENGINEERING
J.D.  Gannon  (ed.),  Second  Edition,  March  1974

CSRG-32  SCHEDULING  MULTIPLE  RESOURCE  COMPUTER  SYSTEMS
E.D.  Lazowska,  May  1974
[M.Sc.  Thesis,  DCS  1974]

CSRG-33  AN  EDUCATIONAL  DATA  BASE  MANAGEMENT  SYSTEM
F.  Lochovsky  and  D.  Tsichritzis,  May  1974

*CSRG-34  ALLOCATING  STORAGE  IN  HIERARCHICAL  DATA  BASES
P.  Bernstein  and  D.  Tsichritzis,  May  1974

CSRG-35  ON  IMPLEMENTATION  OF  RELATIONS
D.  Tsichritzis,  May  1974

CSRG-36  SIX  PL/I  COMPILERS
D.B.  Wortman,  P.J.  Khaiat,  and  D.M.  Lasker,
August  1974

CSRG-37  A  METHODOLOGY  FOR  STUDYING  THE  PSYCHOLOGICAL  COMPLEXITY
OF  COMPUTER  PROGRAMS
Laurence  M.  Weissman,  August  1974
[Ph.D.  Thesis,  DCS  1974]

*CSRG-38  AN  INVESTIGATION  OF  A  NEW  METHOD  OF  CONSTRUCTING
SOFTWARE
David  M.  Lasker,  September  1974
[M.Sc.  Thesis,  DCS  1974]

CSRG-39  AN  ALGEBRAIC  MODEL  FOR  STRING  PATTERNS
Glenn  F.  Stewart,  September  1974
[M.Sc.  Thesis,  DCS,  1974]


CSRG-40  EDUCATIONAL  DATA  BASE  SYSTEM  USER'S  MANUAL
J.  Klebanoff,  F.  Lochovsky,  A.  Rozitis,  and

D.  Tsichritzis,  September  1974 

CSRG-41  NOTES  FROM  A  WORKSHOP  ON  THE  ATTAINMENT  OF
RELIABLE  SOFTWARE

David  B.  Wortman  (ed.),  September  1974

CSRG-42  THE  PROJECT  SUE  SYSTEM  LANGUAGE  REFERENCE  MANUAL 
B.L.  Clark  and  F.J.B.  Ham,  September  1974

CSRG-43  A  DATA  BASE  PROCESSOR

E.A.  Ozkarahan,  S.A.  Schuster  and  K.C.  Smith,
November  1974 


CSRG-44  MATCHING  PROGRAM  AND  DATA  REPRESENTATION  TO  A
COMPUTING  ENVIRONMENT
Eric  C.F.  Hehner,  November  1974 
[Ph.D.  Thesis,  DCS,  1974] 

CSRG-45  THREE  APPROACHES  TO  RELIABLE  SOFTWARE:  LANGUAGE
DESIGN,  DYADIC  SPECIFICATION,  COMPLEMENTARY  SEMANTICS
J.E.  Donahue,  J.D.  Gannon,  J.V.  Guttag  and
J.J.  Horning,  December  1974 


CSRG-46  THE  SYNTHESIS  OF  OPTIMAL  DECISION  TREES  FROM
DECISION  TABLES 

Helmut  Schumacher,  December  1974 
[M.Sc.  Thesis,  DCS,  1974]

CSRG-47  LANGUAGE  DESIGN  TO  ENHANCE  PROGRAMMING  RELIABILITY 
John  D.  Gannon,  January  1975 
[Ph.D.  Thesis,  DCS,  1975] 

CSRG-48  DETERMINISTIC  LEFT  TO  RIGHT  PARSING

Christopher  J.M.  Turnbull,  January  1975 
[Ph.D.  Thesis,  EE,  1974] 

CSRG-49  A  NETWORK  FRAMEWORK  FOR  RELATIONAL  IMPLEMENTATION 
D.  Tsichritzis,  February  1975 

CSRG-50  A  UNIFIED  APPROACH  TO  FUNCTIONAL  DEPENDENCIES 
AND  RELATIONS 

P.A.  Bernstein,  J.R.  Swenson  and  D.C.  Tsichritzis 
February  1975 

CSRG-51  ZETA:  A  PROTOTYPE  RELATIONAL  DATA  BASE
MANAGEMENT  SYSTEM
M.  Brodie  (ed.),  February  1975
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